PCT



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification:

C12N 15/31, C07K 14/315, 16/12, C12Q 1/68



A2 



(11) International Publication Number: WO 98/18931
(43) International Publication Date: 7 May 1998 (07.05.98)



(21) International Application Number: PCT/US97/19588

(22) International Filing Date: 30 October 1997 (30.10.97)



(30) Priority Data:
60/029,960    31 October 1996 (31.10.96)    US



(71) Applicant (for all designated States except US): HUMAN
GENOME SCIENCES, INC. [US/US]; 9410 Key West
Avenue, Rockville, MD 20850 (US).

(72) Inventors; and

(75) Inventors/Applicants (for US only): KUNSCH, Charles, A.
[US/US]; 2398B Dunwoody Crossing, Atlanta, GA 30338
(US). CHOI, Gil, H. [KR/US]; 11429 Potomac Oaks Drive,
Rockville, MD 20850 (US). DILLON, Patrick, J. [US/US];
1055 Snipe Court, Carlsbad, CA 92009 (US). ROSEN,
Craig, A. [US/US]; 22400 Rolling Hill Road, Laytonsville,
MD 20882 (US). BARASH, Steven, C. [US/US]; 582 College
Parkway #303, Rockville, MD 20850 (US). FANNON,
Michael [US/US]; 13501 Rippling Brook Drive, Silver
Spring, MD 20850 (US). DOUGHERTY, Brian, A.
[US/US]; 708 Meadow Field Court, Mount Airy, MD 21771
(US).



(74) Agents: BROOKES, A., Anders et al.; Human Genome 
Sciences, Inc., 9410 Key West Avenue, Rockville, MD 
20850 (US). 



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
GH, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK,
LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO,
NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR,
TT, UA, UG, US, UZ, VN, YU, ZW, ARIPO patent (GH,
KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ,
BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE,
CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL,
PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN,
ML, MR, NE, SN, TD, TG).



Published

Without international search report and to be republished
upon receipt of that report.



(54) Title: STREPTOCOCCUS PNEUMONIAE POLYNUCLEOTIDES AND SEQUENCES



[Front-page figure: block diagram. Recoverable labels: Computer System 102; 116; Removable Storage Medium.]



(57) Abstract 

The present invention provides polynucleotide sequences of the genome of Streptococcus pneumoniae, polypeptide sequences encoded
by the polynucleotide sequences, corresponding polynucleotides and polypeptides, vectors and hosts comprising the polynucleotides, and
assays and other uses thereof. The present invention further provides polynucleotide and polypeptide sequence information stored on
computer readable media, and computer-based systems and methods which facilitate its use.



FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe





Streptococcus pneumoniae Polynucleotides and Sequences 
FIELD OF THE INVENTION 

The present invention relates to the field of molecular biology. In
particular, it relates to, among other things, nucleotide sequences of Streptococcus
pneumoniae, contigs, ORFs, fragments, probes, primers and related
polynucleotides thereof, peptides and polypeptides encoded by the sequences, and
uses of the polynucleotides and sequences thereof, such as in fermentation,
polypeptide production, assays and pharmaceutical development, among others.

BACKGROUND OF THE INVENTION 

Streptococcus pneumoniae has been one of the most extensively studied
microorganisms since its first isolation in 1881. It was the object of many
investigations that led to important scientific discoveries. In 1928, Griffith
observed that when heat-killed encapsulated pneumococci and live strains
constitutively lacking any capsule were concomitantly injected into mice, the
nonencapsulated could be converted into encapsulated pneumococci with the same
capsular type as the heat-killed strain. Years later, the nature of this "transforming
principle," or carrier of genetic information, was shown to be DNA. (Avery, O.T.,
et al., J. Exp. Med. 79:137-157 (1944)).

In spite of the vast number of publications on S. pneumoniae, many
questions about its virulence are still unanswered, and this pathogen remains a
major causative agent of serious human disease, especially community-acquired
pneumonia. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517
(1991)). In addition, in developing countries, the pneumococcus is responsible for
the death of a large number of children under the age of 5 years from pneumococcal
pneumonia. The incidence of pneumococcal disease is highest in infants under 2
years of age and in people over 60 years of age. Pneumococci are the second most
frequent cause (after Haemophilus influenzae type b) of bacterial meningitis and
otitis media in children. With the recent introduction of conjugate vaccines for H.
influenzae type b, pneumococcal meningitis is likely to become increasingly
prominent. S. pneumoniae is the most important etiologic agent of
community-acquired pneumonia in adults and is the second most common cause of
bacterial meningitis behind Neisseria meningitidis.

The antibiotic generally prescribed to treat S. pneumoniae is
benzylpenicillin, although resistance to this and to other antibiotics is found
occasionally. Pneumococcal resistance to penicillin results from mutations in its
penicillin-binding proteins. In uncomplicated pneumococcal pneumonia caused by
a sensitive strain, treatment with penicillin is usually successful unless started too
late. Erythromycin or clindamycin can be used to treat pneumonia in patients
hypersensitive to penicillin, but strains resistant to these drugs exist. Broad
spectrum antibiotics (e.g., the tetracyclines) may also be effective, although
tetracycline-resistant strains are not rare. In spite of the availability of antibiotics,
the mortality of pneumococcal bacteremia in the last four decades has remained
stable between 25 and 29%. (Gillespie, S.H., et al., J. Med. Microbiol. 28:237-248
(1989)).

S. pneumoniae is carried in the upper respiratory tract by many healthy
individuals. It has been suggested that attachment of pneumococci is mediated by a
disaccharide receptor on fibronectin, present on human pharyngeal epithelial cells.
(Anderson, B.J., et al., J. Immunol. 142:2464-2468 (1989)). The mechanisms by
which pneumococci translocate from the nasopharynx to the lung, thereby causing
pneumonia, or migrate to the blood, giving rise to bacteremia or septicemia, are
poorly understood. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517
(1991)).

Various proteins have been suggested to be involved in the pathogenicity of
S. pneumoniae; however, only a few of them have actually been confirmed as
virulence factors. Pneumococci produce an IgA1 protease that might interfere with
host defense at mucosal surfaces. (Kornfield, S.J., et al., Rev. Inf. Dis. 3:521-534
(1981)). S. pneumoniae also produces neuraminidase, an enzyme that may
facilitate attachment to epithelial cells by cleaving sialic acid from the host
glycolipids and gangliosides. Partially purified neuraminidase was observed to
induce meningitis-like symptoms in mice; however, the reliability of this finding
has been questioned because the neuraminidase preparations used were probably
contaminated with cell wall products. Other pneumococcal proteins besides
neuraminidase are involved in the adhesion of pneumococci to epithelial and
endothelial cells. These pneumococcal proteins have not yet been identified.

Recently, Cundell et al. reported that peptide permeases can modulate
pneumococcal adherence to epithelial and endothelial cells. It was, however,
unclear whether these permeases function directly as adhesins or whether they
enhance adherence by modulating the expression of pneumococcal adhesins.
(DeVelasco, E.A., et al., Micro. Rev. 59:591-603 (1995)). A better understanding
of the virulence factors determining its pathogenicity will need to be developed to
cope with the devastating effects of pneumococcal disease in humans.

Ironically, despite the prominent role of S. pneumoniae in the discovery of
DNA, little is known about the molecular genetics of the organism. The S.
pneumoniae genome consists of one circular, covalently closed, double-stranded
DNA and a collection of so-called variable accessory elements, such as prophages,
plasmids, transposons and the like. Most physical characteristics and almost all of
the genes of S. pneumoniae are unknown. Among the few that have been
identified, most have not been physically mapped or characterized in detail. Only a
few genes of this organism have been sequenced. (See, for instance, current
versions of GENBANK and other nucleic acid databases, and references that relate
to the genome of S. pneumoniae such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S.
pneumoniae infection involves the programmed expression of S. pneumoniae
genes, and that characterizing the genes and their patterns of expression would add
dramatically to our understanding of the organism and its host interactions.
Knowledge of S. pneumoniae genes and genomic organization would improve our
understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover,
characterized genes and genomic fragments of S. pneumoniae would provide
reagents for, among other things, detecting, characterizing and controlling S.
pneumoniae infections. There is a need to characterize the genome of S.
pneumoniae and for polynucleotides of this organism.

SUMMARY OF THE INVENTION

The present invention is based on the sequencing of fragments of the
Streptococcus pneumoniae genome. The primary nucleotide sequences which were
generated are provided in SEQ ID NOS: 1-391.

The present invention provides the nucleotide sequence of several hundred
contigs of the Streptococcus pneumoniae genome, which are listed in tables below
and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted
by a skilled artisan. In one embodiment, the present invention is provided as
contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS: 1-391.

The present invention further provides nucleotide sequences which are at
least 95% identical to the nucleotide sequences of SEQ ID NOS: 1-391.

The nucleotide sequence of SEQ ID NOS: 1-391, a representative fragment
thereof, or a nucleotide sequence which is at least 95% identical to the nucleotide
sequence of SEQ ID NOS: 1-391 may be provided in a variety of mediums to
facilitate its use. In one application of this embodiment, the sequences of the
present invention are recorded on computer readable media. Such media include,
but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM;
electrical storage media such as RAM and ROM; and hybrids of these categories
such as magnetic/optical storage media.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information herein described stored in a
data storage means. Such systems are designed to identify commercially important
fragments of the Streptococcus pneumoniae genome.

Another embodiment of the present invention is directed to fragments of the
Streptococcus pneumoniae genome having particular structural or functional
attributes. Such fragments of the Streptococcus pneumoniae genome of the present
invention include, but are not limited to, fragments which encode peptides,
hereinafter referred to as open reading frames or ORFs, fragments which modulate
the expression of an operably linked ORF, hereinafter referred to as expression
modulating fragments or EMFs, and fragments which can be used to diagnose the
presence of Streptococcus pneumoniae in a sample, hereinafter referred to as
diagnostic fragments or DFs.

Each of the ORFs in fragments of the Streptococcus pneumoniae genome
disclosed in Tables 1-3, and the EMFs found 5' to the ORFs, can be used in
numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the
presence of a specific microbe in a sample, to selectively control gene expression in
a host, and in the production of polypeptides, such as polypeptides encoded by
ORFs of the present invention, particularly those polypeptides that have a
pharmacological activity.

The present invention further includes recombinant constructs comprising
one or more fragments of the Streptococcus pneumoniae genome of the present
invention. The recombinant constructs of the present invention comprise vectors,
such as a plasmid or viral vector, into which a fragment of the Streptococcus
pneumoniae genome has been inserted.

The present invention further provides host cells containing any of the
isolated fragments of the Streptococcus pneumoniae genome of the present
invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a
bacterial cell.

The present invention is further directed to isolated polypeptides and
proteins encoded by ORFs of the present invention. A variety of methods, well
known to those of skill in the art, routinely may be utilized to obtain any of the
polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid
sequences readily can be synthesized using commercially available automated
peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another
alternative is to purify polypeptides and proteins of the present invention from cells
which have been altered to express them.

The invention further provides methods of obtaining homologs of the
fragments of the Streptococcus pneumoniae genome of the present invention and
homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as
a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind
polypeptides and proteins of the present invention. Such antibodies include both
monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the
above-described antibodies. A hybridoma is an immortalized cell line which is
capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples
derived from cells which express one of the ORFs of the present invention, or a
homolog thereof. Such methods comprise incubating a test sample with one or
more of the antibodies of the present invention, or one or more of the DFs of the
present invention, under conditions which allow a skilled artisan to determine if the
sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which
contain the necessary reagents to carry out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in
close confinement, one or more containers which comprises: (a) a first container
comprising one of the antibodies, or one of the DFs of the present invention; and
(b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting the presence of bound antibodies or
hybridized DFs.

Using the isolated proteins of the present invention, the present invention
further provides methods of obtaining and identifying agents capable of binding to
a polypeptide or protein encoded by one of the ORFs of the present invention.
Specifically, such agents include, as further described below, antibodies, peptides,
carbohydrates, pharmaceutical agents and the like. Such methods comprise steps
of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of
the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Streptococcus pneumoniae will be of
great value to all laboratories working with this organism and for a variety of
commercial purposes. Many fragments of the Streptococcus pneumoniae genome
will be immediately identified by similarity searches against GenBank or protein
databases and will be of immediate value to Streptococcus pneumoniae researchers
and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic
sequences of bacterial and other genomes has enhanced, and will greatly enhance,
the ability to analyze and understand chromosomal organization. In particular,
sequenced contigs and genomes will provide the models for developing tools for
the analysis of chromosome structure and function, including the ability to identify
genes within large segments of genomic DNA, the structure, position, and spacing
of regulatory elements, the identification of genes with potential industrial
applications, and the ability to do comparative genomic and molecular phylogeny.

DESCRIPTION OF THE FIGURES

FIGURE 1 is a block diagram of a computer system (102) that can be
used to implement computer-based systems of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer
programs used to collect, assemble, edit and annotate the contigs of the
Streptococcus pneumoniae genome of the present invention. Both Macintosh and
Unix platforms are used to handle the AB 373 and 377 sequence data files, largely
as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii
International Conference on System Sciences, 585, IEEE Computer Society Press,
Washington D.C. (1993). Factura (AB) is a Macintosh program designed for
automatic vector sequence removal and end-trimming of sequence files. The
program Loadis runs on a Macintosh platform and parses the feature data extracted
from the sequence files by Factura to the Unix based Streptococcus pneumoniae
relational database. Assembly of contigs (and whole genome sequences) is
accomplished by retrieving a specific set of sequence files and their associated
features using Extrseq, a Unix utility for retrieving sequences from an SQL
database. The resulting sequence file is processed by seq_filter to trim portions of
the sequences with more than 2% ambiguous nucleotides. The sequence files were
assembled using TIGR Assembler, an assembly engine designed at The Institute
for Genomic Research (TIGR) for rapid and accurate assembly of thousands of
sequence fragments. The collection of contigs generated by the assembly step is
loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf or GenMark. The
ORFs are searched against S. pneumoniae sequences from GenBank and against all
protein sequences using the BLASTN and BLASTP programs, described in
Altschul et al., J. Mol. Biol. 215:403-410 (1990). Results of the ORF
determination and similarity searching steps were loaded into the database. As
described below, some results of the determination and the searches are set out in
Tables 1-3.
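
By way of illustration only, the filtering and ORF identification steps described for FIGURE 2 can be sketched in Python. This sketch does not reproduce seq_filter, zorf or GenMark, whose internals are not described here. The function names, the 50-base window, the end-trimming strategy and the forward-strand-only ATG-to-stop scan are all assumptions made for the example; only the 2% ambiguity threshold comes from the text.

```python
# Hypothetical sketch; not the seq_filter, zorf or GenMark programs.
UNAMBIGUOUS = set("ACGT")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def ambiguous_fraction(seq):
    """Fraction of symbols in seq that are not a plain A, C, G or T."""
    if not seq:
        return 0.0
    return sum(ch not in UNAMBIGUOUS for ch in seq.upper()) / len(seq)


def trim_ambiguous(seq, window=50, max_fraction=0.02):
    """Drop window-sized chunks from either end while the chunk at that
    end has more than max_fraction ambiguous symbols (assumed strategy)."""
    start, end = 0, len(seq)
    while end - start >= window and \
            ambiguous_fraction(seq[start:start + window]) > max_fraction:
        start += window
    while end - start >= window and \
            ambiguous_fraction(seq[end - window:end]) > max_fraction:
        end -= window
    return seq[start:end]


def find_orfs(seq, min_codons=30):
    """Naive forward-strand scan of the three reading frames.

    Returns (start, end) pairs, 0-based and half-open, running from an
    ATG through the first in-frame stop codon. min_codons counts the
    start and stop codons as part of the ORF.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs
```

A fuller treatment would also scan the reverse complement and would use a statistical gene model, as GenMark does, rather than a bare start/stop search.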

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The present invention is based on the sequencing of fragments of the
Streptococcus pneumoniae genome and analysis of the sequences. The primary
nucleotide sequences generated by sequencing the fragments are provided in SEQ
ID NOS: 1-391. (As used herein, the "primary sequence" refers to the nucleotide
sequence represented by the IUPAC nomenclature system.)
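
As a minimal sketch of what representation in the IUPAC nomenclature implies in practice, the following Python checks that a string uses only IUPAC nucleotide symbols and reports where ambiguity codes occur. The function names are hypothetical and are not part of the disclosure.

```python
# IUPAC nucleotide symbols: the bases A, C, G, T (and U) plus the
# ambiguity codes R, Y, S, W, K, M, B, D, H, V and N.
IUPAC_NUCLEOTIDES = frozenset("ACGTURYSWKMBDHVN")


def is_primary_sequence(seq):
    """True if seq is non-empty and every symbol is an IUPAC nucleotide code."""
    return len(seq) > 0 and all(ch in IUPAC_NUCLEOTIDES for ch in seq.upper())


def ambiguous_positions(seq):
    """0-based positions whose symbol is an ambiguity code rather than a base."""
    return [i for i, ch in enumerate(seq.upper()) if ch not in "ACGTU"]
```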

In addition to the aforementioned Streptococcus pneumoniae polynucleotide
and polynucleotide sequences, the present invention provides the nucleotide
sequences of SEQ ID NOS: 1-391, or representative fragments thereof, in a form
which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence
depicted in SEQ ID NOS: 1-391" refers to any portion of the SEQ ID NOS: 1-391
which is not presently represented within a publicly available database. Preferred
representative fragments of the present invention are Streptococcus pneumoniae
open reading frames (ORFs), expression modulating fragments (EMFs) and
fragments which can be used to diagnose the presence of Streptococcus
pneumoniae in a sample (DFs). A non-limiting identification of preferred
representative fragments is provided in Tables 1-3. As discussed in detail below,
the information provided in SEQ ID NOS: 1-391 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled
in the art to clone and sequence all "representative fragments" of interest, including
open reading frames encoding a large variety of Streptococcus pneumoniae
proteins.

While the presently disclosed sequences of SEQ ID NOS: 1-391 are highly
accurate, sequencing techniques are not perfect and, in relatively rare instances,
further investigation of a fragment or sequence of the invention may reveal a
nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID
NOS: 1-391. However, once the present invention is made available (i.e., once the
information in SEQ ID NOS: 1-391 and Tables 1-3 has been made available),
resolving a rare sequencing error in SEQ ID NOS: 1-391 will be well within the
skill of the art. The present disclosure makes available sufficient sequence
information to allow any of the described contigs or portions thereof to be obtained
readily by straightforward application of routine techniques. Further sequencing of
such polynucleotides may proceed in like manner using manual and automated
sequencing methods which are employed ubiquitously in the art. Nucleotide
sequence editing software is publicly available. For example, Applied Biosystems'
(AB) AutoAssembler can be used as an aid during visual inspection of nucleotide
sequences. By employing such routine techniques potential errors readily may be
identified and the correct sequence then may be ascertained by targeting further
sequencing effort, also of a routine nature, to the region containing the potential
error.

Even if all of the very rare sequencing errors in SEQ ID NOS: 1-391 were
corrected, the resulting nucleotide sequences would still be at least 95% identical,
nearly all would be at least 99% identical, and the great majority would be at least
99.9% identical to the nucleotide sequences of SEQ ID NOS: 1-391.

As discussed elsewhere herein, polynucleotides of the present invention
readily may be obtained by routine application of well known and standard
procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of
Streptococcus pneumoniae strains that can be used to prepare S. pneumoniae
genomic DNA for cloning and for obtaining polynucleotides of the present
invention are available to the public from recognized depository institutions, such
as the American Type Culture Collection (ATCC). While the present invention is
enabled by the sequences and other information herein disclosed, the S.
pneumoniae strain that provided the DNA of the present Sequence Listing, Strain
7/87 14.8.91, has been deposited in the ATCC, as a convenience to those of skill
in the art. As a further convenience, a library of S. pneumoniae genomic DNA,
derived from the same strain, also has been deposited in the ATCC. The S.
pneumoniae strain was deposited on October 10, 1996, and was given Deposit No.
55840, and the cDNA library was deposited on October 11, 1996 and was given
Deposit No. 97755. The genomic fragments in the library are 15 to 20 kb
fragments generated by partial Sau3AI digestion and they are inserted into the
BamHI site in the well-known lambda-derived vector lambda DASH II (Stratagene,
La Jolla, CA). The provision of the deposits is not a waiver of any rights of the
inventors or their assignees in the present subject matter.

The nucleotide sequences of the genomes from different strains of
Streptococcus pneumoniae differ somewhat. However, the nucleotide sequences
of the genomes of all Streptococcus pneumoniae strains will be at least 95%
identical, in corresponding part, to the nucleotide sequences provided in SEQ ID
NOS: 1-391. Nearly all will be at least 99% identical and the great majority will be
99.9% identical.

Thus, the present invention further provides nucleotide sequences which 
are at least 95%, preferably 99% and most preferably 99.9% identical to the 
nucleotide sequences of SEQ ID NOS: 1-391, in a form which can be readily used, 
analyzed and interpreted by the skilled artisan. 

Methods for determining whether a nucleotide sequence is at least 95%, at
least 99% or at least 99.9% identical to the nucleotide sequences of SEQ ID
NOS: 1-391 are routine and readily available to the skilled artisan. For example, the
well known fasta algorithm described in Pearson and Lipman, Proc. Natl. Acad.
Sci. USA 85: 2444 (1988) can be used to generate the percent identity of nucleotide
sequences. The BLASTN program also can be used to generate an identity score
of polynucleotides compared to one another.
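
The percent identity figures above can be illustrated with a short Python sketch. It is not the fasta or BLASTN algorithm. It assumes the two sequences have already been aligned by such a program, are of equal length, and use "-" for gaps. Counting a gap opposite a base as a mismatch is one convention among several, chosen here for simplicity. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same symbol.

    Columns where both sequences have a gap are skipped; a gap opposite a
    base counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold=95.0):
    """True if the two aligned sequences are at least threshold percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For two aligned 20-base sequences differing at one position, percent_identity returns 95.0, so the pair meets the 95% threshold but not the 99% or 99.9% thresholds.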

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS: 1-391, a representative
fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99%
and most preferably at least 99.9% identical to a polynucleotide sequence of SEQ
ID NOS: 1-391 may be "provided" in a variety of mediums to facilitate use thereof.
As used herein, "provided" refers to a manufacture, other than an isolated nucleic
acid molecule, which contains a nucleotide sequence of the present invention; i.e.,
a nucleotide sequence provided in SEQ ID NOS: 1-391, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most
preferably at least 99.9% identical to a polynucleotide of SEQ ID NOS: 1-391.
Such a manufacture provides a large portion of the Streptococcus pneumoniae
genome and parts thereof (e.g., a Streptococcus pneumoniae open reading frame
(ORF)) in a form which allows a skilled artisan to examine the manufacture using
means not directly applicable to examining the Streptococcus pneumoniae genome
or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media,
such as floppy discs, hard disc storage medium, and magnetic tape; optical storage
media such as CD-ROM; electrical storage media such as RAM and ROM; and
hybrids of these categories, such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable
mediums can be used to create a manufacture comprising computer readable
medium having recorded thereon a nucleotide sequence of the present invention.
Likewise, it will be clear to those of skill how additional computer readable media
that may be developed also can be used to create analogous manufactures having
recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the presently
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention. A variety of data storage structures are available to a skilled artisan
for creating a computer readable medium having recorded thereon a nucleotide
sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In
addition, a variety of data processor programs and formats can be used to store the
nucleotide sequence information of the present invention on computer readable
medium. The sequence information can be represented in a word processing text
file, formatted in commercially available software such as WordPerfect and
Microsoft Word, or represented in the form of an ASCII file, stored in a database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily
adapt any number of data-processor structuring formats (e.g., text file or database)
in order to obtain computer readable medium having recorded thereon the
nucleotide sequence information of the present invention.
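
As a minimal sketch of the two storage options named above (an ASCII text file
and a database), the following Python example records one sequence both ways.
SQLite stands in here for a database application such as DB2, Sybase or Oracle;
the header-line layout of the text file, the table name "sequences" and all
function names are illustrative assumptions and are not part of the disclosure.

```python
import sqlite3
from pathlib import Path


def record_as_text(path, seq_id, sequence, width=60):
    """Write one sequence to a plain ASCII file: a '>' header line, then wrapped residues."""
    lines = [f">{seq_id}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def read_from_text(path):
    """Read back the (identifier, sequence) pair written by record_as_text."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    return lines[0][1:], "".join(lines[1:])


def record_in_database(conn, seq_id, sequence):
    """Store one sequence in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute("INSERT OR REPLACE INTO sequences VALUES (?, ?)", (seq_id, sequence))
    conn.commit()


def read_from_database(conn, seq_id):
    """Return the stored residues for seq_id, or None if it was never recorded."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

Which of the two forms is preferable depends, as stated above, on the means
chosen to access the stored information: the text file suits sequential reading,
while the table suits retrieval by identifier.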

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form the nucleotide sequences of SEQ ID NOS: 1-391,
a representative fragment thereof, or a nucleotide sequence at least 95%,
preferably at least 99% and most preferably at least 99.9% identical to a sequence
of SEQ ID NOS: 1-391, the present invention enables the skilled artisan routinely to
access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements
the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE
(Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Streptococcus
pneumoniae genome which contain homology to ORFs or proteins from both
Streptococcus pneumoniae and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Streptococcus pneumoniae genome
useful in producing commercially important proteins, such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described herein. Such
systems are designed to identify, among other things, commercially important
fragments of the Streptococcus pneumoniae genome.

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing
unit (CPU), input means, output means, and data storage means. A skilled artisan
can readily appreciate that any one of the currently available computer-based
systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention
comprise a data storage means having stored therein a nucleotide sequence of the
present invention and the necessary hardware means and software means for
supporting and implementing a search means.

As used herein, "data storage means" refers to memory which can store
nucleotide sequence information of the present invention, or a memory access
means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage
means. Search means are used to identify fragments or regions of the present
genomic sequences which match a particular target sequence or target motif. A
variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the
computer-based systems of the present invention. Examples of such software
include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX
(NCBIA). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches
can be adapted for use in the present computer-based systems.

As used herein, a "target sequence" can be any DNA or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan
can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most
preferred sequence length of a target sequence is from about 10 to 100 amino acids
or from about 30 to 300 nucleotide residues. However, it is well recognized that
searches for commercially important fragments, such as sequence fragments
involved in gene expression and protein processing, may be of shorter length.
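
The relationship between target length and chance occurrence can be made
concrete with a short calculation. The sketch below assumes, purely for
illustration, a database of independent, equally frequent residues (4 for
nucleotides, 20 for amino acids); real genomes are not uniformly random, and the
function name is hypothetical.

```python
def expected_random_matches(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches to one target in a random database.

    Each of the (database_length - target_length + 1) start positions matches
    with probability (1 / alphabet_size) ** target_length under the uniform,
    independent-residue assumption.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length
```

Under these assumptions a six-nucleotide target is expected to occur roughly
488 times by chance in a database of 2,000,000 nucleotides, whereas a
30-nucleotide target is expected to occur far less than once, which is
consistent with the preference stated above for longer target sequences.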

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s)
are chosen based on a three-dimensional configuration which is formed upon the
folding of the target motif. There are a variety of target motifs known in the art.
Protein target motifs include, but are not limited to, enzymic active sites and signal
sequences. Nucleic acid target motifs include, but are not limited to, promoter
sequences, hairpin structures and inducible expression elements (protein binding
sequences).

A variety of structural formats for the input and output means can be used
to input and output the information in the computer-based systems of the present
invention. A preferred format for an output means ranks fragments of the
Streptococcus pneumoniae genomic sequences possessing varying degrees of
homology to the target sequence or target motif. Such presentation provides a
skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in
the identified fragment.
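
A ranked output of the kind described above can be illustrated with a
deliberately naive search means. The sketch below slides the target along a
stored sequence, scores each ungapped window by percent identity, and lists the
best windows first. It is not BLAST or BLAZE, which use heuristic or
dynamic-programming alignment; the function name and the tuple layout of the
output are assumptions made for this example.

```python
def rank_fragments(genome, target, top=3):
    """Rank same-length windows of `genome` by percent identity to `target`.

    Returns up to `top` tuples of (percent_identity, 0-based start, window),
    best match first; ties are ordered by position.
    """
    genome, target = genome.upper(), target.upper()
    n = len(target)
    hits = []
    for start in range(len(genome) - n + 1):
        window = genome[start:start + n]
        matches = sum(a == b for a, b in zip(window, target))
        hits.append((100.0 * matches / n, start, window))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top]
```

For example, searching "AAAACGTAAAACGAAAA" with the target "ACGT" ranks the
exact match at position 3 first (100% identity) and the partial match "ACGA" at
position 10 second (75% identity).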

A variety of comparing means can be used to compare a target sequence or
target motif with the data storage means to identify sequence fragments of the
Streptococcus pneumoniae genome. In the present examples, implementing
software which implements the BLAST and BLAZE algorithms, described in
Altschul et al., J. Mol. Biol. 215:403-410 (1990), is used to identify open reading
frames within the Streptococcus pneumoniae genome. A skilled artisan can readily
recognize that any one of the publicly available homology search programs can be
used as the search means for the computer-based systems of the present invention.
Of course, suitable proprietary systems that may be known to those of skill also
may be employed in this regard.

Figure 1 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102
includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM)
and a variety of secondary storage devices 110, such as a hard drive 112 and a
removable medium storage device 114. The removable medium storage device 114
may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape
drive, etc. A removable storage medium 116 (such as a floppy disk, a compact
disk, a magnetic tape, etc.) containing control logic and/or data recorded therein
may be inserted into the removable medium storage device 114. The computer
system 102 includes appropriate software for reading the control logic and/or the
data from the removable storage medium 116, once it is inserted into the
removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well
known manner in the main memory 108, any of the secondary storage devices 110,
and/or a removable storage medium 116. During execution, software for accessing
and processing the genomic sequence (such as search tools, comparing tools, etc.)
resides in main memory 108, in accordance with the requirements and operating
parameters of the operating system, the hardware system and the software program
or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to isolated
fragments of the Streptococcus pneumoniae genome. The fragments of the
Streptococcus pneumoniae genome of the present invention include, but are not
limited to, fragments which encode peptides and polypeptides, hereinafter open
reading frames (ORFs); fragments which modulate the expression of an operably
linked ORF, hereinafter expression modulating fragments (EMFs); and fragments
which can be used to diagnose the presence of Streptococcus pneumoniae in a
sample, hereinafter diagnostic fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment
of the Streptococcus pneumoniae genome" refers to a nucleic acid molecule
possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are
normally associated with the composition. Particularly, the term refers to the
nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-391, to
representative fragments thereof as described above, and to polynucleotides at least
95%, preferably at least 99% and especially preferably at least 99.9% identical in
sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated
fragments of the present invention. These include, but are not limited to, methods
which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Streptococcus pneumoniae DNA can be enzymatically
sheared to produce fragments of 15-20 kb in length. These fragments can then be
used to generate a Streptococcus pneumoniae library by inserting them into lambda
clones as described in the Examples below. Primers flanking, for example, an
ORF, such as those enumerated in Tables 1-3, can then be generated using
nucleotide sequence information provided in SEQ ID NOS: 1-391. Well known
and routine techniques of PCR cloning then can be used to isolate the ORF from
the lambda DNA library or Streptococcus pneumoniae genomic DNA. Thus, given
the availability of SEQ ID NOS: 1-391, the information in Tables 1, 2 and 3, and
the information that may be obtained readily by analysis of the sequences of SEQ
ID NOS: 1-391 using methods set out above, those of skill will be enabled by the
present disclosure to isolate any ORF-containing or other nucleic acid fragment of
the present invention.

The isolated nucleic acid molecules of the present invention include, but are
not limited to, single stranded and double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets
coding for amino acids without any termination codons and is a sequence
translatable into protein.
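
The definition above (a run of triplets uninterrupted by termination codons)
can be sketched as a simple scan. The example below assumes the standard stop
codons TAA, TAG and TGA and examines only the three forward reading frames; it
is not the GeneMark method used to build the tables, and the function name and
minimum-length parameter are illustrative assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frames(seq, min_codons=2):
    """Find stop-free runs of triplets in the three forward frames.

    Returns (frame, start, end) tuples using 0-based, end-exclusive indices,
    so seq[start:end] is the run of sense codons without the flanking stops.
    """
    seq = seq.upper()
    found = []
    for frame in range(3):
        run_start = None
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if run_start is not None and (pos - run_start) // 3 >= min_codons:
                    found.append((frame, run_start, pos))
                run_start = None
            elif run_start is None:
                run_start = pos
        if run_start is not None:
            # Run reaches the end of the sequence: keep whole codons only.
            end = run_start + ((len(seq) - run_start) // 3) * 3
            if (end - run_start) // 3 >= min_codons:
                found.append((frame, run_start, end))
    return found
```

A reverse-strand scan would apply the same function to the reverse complement
of the sequence.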

Tables 1, 2, and 3 list ORFs in the Streptococcus pneumoniae genomic
contigs of the present invention that were identified as putative coding regions by
the GeneMark software using organism-specific second-order Markov probability
transition matrices. It will be appreciated that other criteria can be used, in
accordance with well known analytical methods, such as those discussed herein, to
generate more inclusive, more restrictive, or more selective lists.

Table 1 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that over a continuous region of at least 50 bases are 95% or
more identical (by BLAST analysis) to a nucleotide sequence available through
GenBank in October, 1997.

Table 2 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that are not in Table 1 and match, with a BLASTP probability
score of 0.01 or less, a polypeptide sequence available through GenBank in
October, 1997.

Table 3 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that do not match significantly, by BLASTP analysis, a
polypeptide sequence available through GenBank in October, 1997.

In each table, the first and second columns identify the ORF by,
respectively, contig number and ORF number within the contig; the third column
indicates the first nucleotide of the ORF (actually the first nucleotide of the stop
codon immediately preceding the ORF), counting from the 5' end of the contig
strand; and the fourth column, "stop (nt)", indicates the last nucleotide of the stop
codon defining the 3' end of the ORF.

In Tables 1 and 2, column five lists the Reference for the closest
matching sequence available through GenBank. These reference numbers are the
database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominators. Descriptions of the nomenclature are available
from the National Center for Biotechnology Information. Column six in Tables 1
and 2 provides the gene name of the matching sequence; column seven provides
the BLAST identity score and column eight the BLAST similarity score from the
comparison of the ORF and the homologous gene; and column nine indicates the
length in nucleotides of the highest scoring segment pair identified by the BLAST
identity analysis.

Each ORF described in the tables is defined by "start (nt)" (5') and "stop
(nt)" (3') nucleotide position numbers. These position numbers refer to the
boundaries of each ORF and provide orientation with respect to whether the
forward or reverse strand is the coding strand and in which reading frame the coding
sequence is contained. The "start" position is the first nucleotide of the triplet
encoding a stop codon just 5' to the ORF and the "stop" position is the last
nucleotide of the triplet encoding the next in-frame stop codon (i.e., the stop codon
at the 3' end of the ORF). Those of ordinary skill in the art appreciate that
preferred fragments within each ORF described in the table include fragments of
each ORF which include the entire sequence from the delineated "start" and "stop"
positions excepting the first and last three nucleotides since these encode stop
codons. Thus, polynucleotides set out as ORFs in the tables but lacking the three
(3) 5' nucleotides and the three (3) 3' nucleotides are encompassed by the present
invention. Those of skill also appreciate that particularly preferred are fragments
within each ORF that are polynucleotide fragments comprising polypeptide coding
sequence. As defined herein, "coding sequence" includes the fragment within an
ORF beginning at the first in-frame ATG (triplet encoding methionine) and ending
with the last nucleotide prior to the triplet encoding the 3' stop codon. Preferred
are fragments comprising the entire coding sequence and fragments comprising the
entire coding sequence, excepting the coding sequence for the N-terminal
methionine. Those of skill appreciate that the N-terminal methionine is often
removed during post-translational processing and that polynucleotides lacking the
ATG can be used to facilitate production of N-terminal fusion proteins which may
be beneficial in the production or use of genetically engineered proteins. Of course,
due to the degeneracy of the genetic code many polynucleotides can encode a given
polypeptide. Thus, the invention further includes polynucleotides comprising a
nucleotide sequence encoding a polypeptide sequence itself encoded by the coding
sequence within an ORF described in Tables 1-3 herein. Further, polynucleotides
at least 95%, preferably at least 99% and especially preferably at least 99.9%
identical in sequence to the foregoing polynucleotides, are contemplated by the
present invention.

Polypeptides encoded by polynucleotides described above and elsewhere
herein are also provided by the present invention, as are polypeptides comprising
an amino acid sequence at least about 95%, preferably at least 97% and even more
preferably 99% identical to the amino acid sequence of a polypeptide encoded by an
ORF shown in Tables 1-3. These polypeptides may or may not comprise an
N-terminal methionine.
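
The "start (nt)" and "stop (nt)" conventions described above can be sketched as
follows. The example assumes a forward-strand ORF whose table positions are
1-based and inclusive, with the first and last three nucleotides being the
flanking stop codons; handling of reverse-strand entries is omitted, and the
function name is hypothetical.

```python
def coding_sequence(contig, start_nt, stop_nt):
    """Return the coding sequence of a stop-to-stop ORF on the forward strand.

    start_nt: 1-based position of the first nucleotide of the 5' stop codon.
    stop_nt:  1-based position of the last nucleotide of the 3' stop codon.
    The result runs from the first in-frame ATG to the last nucleotide before
    the 3' stop codon, or is None if no in-frame ATG is present.
    """
    orf = contig[start_nt - 1:stop_nt].upper()
    inner = orf[3:-3]  # drop the three 5' and three 3' nucleotides (stop codons)
    for i in range(0, len(inner) - 2, 3):
        if inner[i:i + 3] == "ATG":
            return inner[i:]
    return None
```

Dropping the first three nucleotides of the returned string gives the variant
lacking the codon for the N-terminal methionine.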

The concepts of percent identity and percent similarity of two polypeptide
sequences are well understood in the art. For example, two polypeptides 10 amino
acids in length which differ at three amino acid positions (e.g., at positions 1, 3
and 5) are said to have a percent identity of 70%. However, the same two
polypeptides would be deemed to have a percent similarity of 80% if, for example
at position 5, the amino acid moieties, although not identical, were "similar" (i.e.,
possessed similar biochemical characteristics). Many programs for analysis of
nucleotide or amino acid sequence similarity, such as FASTA and BLAST, specifically
list percent identity of a matching region as an output parameter. Thus, for
instance, Tables 1 and 2 herein enumerate the percent identity of the highest
scoring segment pair in each ORF and its listed relative. Further details
concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations
provided below.
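
The 70% and 80% figures in the example above can be reproduced with a short
calculation. In the sketch below, the grouping of "similar" amino acids is an
illustrative assumption only; programs such as FASTA and BLAST derive
similarity from scoring matrices rather than from fixed groups, and the two
peptides are invented for the example.

```python
# Hypothetical groups of biochemically similar residues (illustration only).
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ")


def _similar(x, y):
    """True if two residues are identical or fall in the same assumed group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)


def percent_identity(a, b):
    """Percent of aligned positions carrying identical residues."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def percent_similarity(a, b):
    """Percent of aligned positions carrying identical or similar residues."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(_similar(x, y) for x, y in zip(a, b)) / len(a)
```

With the invented peptides "MKTLEAGSPQ" and "GKWLDAGSPQ", which differ at
positions 1, 3 and 5 and carry the similar pair E/D at position 5, the
functions return 70.0 and 80.0 respectively.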

It will be appreciated that other criteria can be used to generate more
inclusive and more exclusive listings of the types set out in the tables. As those of
skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Streptococcus pneumoniae
genome other than those listed in Tables 1-3, such as ORFs which are overlapping
or encoded by the opposite strand of an identified ORF in addition to those
ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a
series of nucleotide molecules which modulates the expression of an operably
linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an
operably linked sequence" when the expression of the sequence is altered by the
presence of the EMF. EMFs include, but are not limited to, promoters, and
promoter modulating sequences (inducible elements). One class of EMFs are
fragments which induce the expression of an operably linked ORF in response to a
specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Streptococcus
pneumoniae genome by their proximity to the ORFs provided in Tables 1-3. An
intergenic segment, or a fragment of the intergenic segment, from about 10 to 200
nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate
the expression of an operably linked ORF in a fashion similar to that found with the
naturally linked ORF sequence. As used herein, an "intergenic segment" refers to
fragments of the Streptococcus pneumoniae genome which are between two
ORF(s) herein described. EMFs also can be identified using known EMFs as a
target sequence or target motif in the computer-based systems of the present
invention. Further, the two methods can be combined and used together.
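
Locating intergenic segments from ORF coordinates of the kind listed in the
tables can be sketched as below. The example assumes that each ORF is given as
a pair of 1-based, inclusive positions on one contig (in either order, so that
reverse-strand entries are accepted) and reports the gaps between consecutive
ORFs; the function name and the minimum-length cutoff are assumptions for
illustration.

```python
def intergenic_segments(orfs, min_len=10):
    """Return (first_nt, last_nt) gaps between consecutive ORFs on one contig.

    `orfs` is an iterable of (start_nt, stop_nt) pairs, 1-based and inclusive.
    Gaps shorter than `min_len` nucleotides, and overlapping ORFs, yield nothing.
    """
    spans = sorted((min(a, b), max(a, b)) for a, b in orfs)
    gaps = []
    covered_to = 0  # last position covered by any ORF seen so far
    for start, end in spans:
        if covered_to and start - covered_to - 1 >= min_len:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)
    return gaps
```

Each reported gap, or a fragment of it from about 10 to 200 nucleotides in
length, is then a candidate to be tested for EMF activity.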

The presence and activity of an EMF can be confirmed using an EMF trap
vector. An EMF trap vector contains a cloning site linked to a marker sequence. A
marker sequence encodes an identifiable phenotype, such as antibiotic resistance or
a complementing nutrition auxotrophic factor, which can be identified or assayed
when the EMF trap vector is placed within an appropriate host under appropriate
conditions. As described above, an EMF will modulate the expression of an
operably linked marker sequence. A more detailed discussion of various marker
sequences is provided below. A sequence which is suspected of being an EMF is
cloned in all three reading frames in one or more restriction sites upstream from the
marker sequence in the EMF trap vector. The vector is then transformed into an
appropriate host using known procedures and the phenotype of the transformed
host is examined under appropriate conditions. As described above, an EMF will
modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide
molecules which selectively hybridize to Streptococcus pneumoniae sequences.
DFs can be readily identified by identifying unique sequences within contigs of the
Streptococcus pneumoniae genome, such as by using well-known computer
analysis software, and by generating and testing probes or amplification primers
consisting of the DF sequence in an appropriate diagnostic format which
determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not
limited to the specific sequences herein described, but also include allelic and
species variations thereof. Allelic and species variations can be routinely
determined by comparing the sequences provided in SEQ ID NOS: 1-391, a
representative fragment thereof, or a nucleotide sequence at least 95%, preferably
at least 99% and most preferably at least 99.9% identical to SEQ ID NOS: 1-391,
with a sequence from another isolate of the same species. Furthermore, to
accommodate codon variability, the invention includes nucleic acid molecules
coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon
for another which encodes the same amino acid is expressly contemplated. Any
specific sequence disclosed herein can be readily screened for errors by
resequencing a particular fragment, such as an ORF, in both directions (i.e.,
sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Streptococcus pneumoniae origin
isolated by using part or all of the fragments in question as a probe or primer.
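
The codon substitution contemplated above can be checked computationally: two
coding sequences are equivalent in this sense when they translate to the same
amino acid sequence. The sketch below builds the standard genetic code table
and compares translations; the function names are hypothetical, and sequences
containing ambiguous bases are outside its scope.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate a DNA coding sequence whose length is a multiple of three."""
    seq = coding_sequence.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encode_same_polypeptide(seq_a, seq_b):
    """True if two coding sequences specify an identical amino acid sequence."""
    return translate(seq_a) == translate(seq_b)
```

For instance, "ATGGCTAAA" and "ATGGCCAAG" differ at two nucleotides yet both
translate to Met-Ala-Lys, so the second is a codon-substituted form of the
first.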

Preferred DFs of the present invention comprise at least about 17,
preferably at least about 20, and more preferably at least about 50 contiguous
nucleotides within an ORF set out in Tables 1-3. Most highly preferred DFs
specifically hybridize to a polynucleotide containing the sequence of the ORF from
which they are derived. Specific hybridization occurs even under stringent
conditions defined elsewhere herein.
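
A first computational screen for candidate DFs can be sketched as a search for
stretches of an ORF that do not occur in other known sequences. The example
below uses exact substring matching on the given strand only, which is a strong
simplification: it is not a substitute for the hybridization or amplification
testing described above, and the function name and the "background" collection
of comparison sequences are assumptions for illustration.

```python
def candidate_diagnostic_fragments(orf, background, k=17):
    """List k-nucleotide stretches of `orf` absent from every background sequence.

    Returns (0-based start, fragment) tuples in order of position.
    """
    orf = orf.upper()
    others = [seq.upper() for seq in background]
    candidates = []
    for start in range(len(orf) - k + 1):
        fragment = orf[start:start + k]
        if not any(fragment in seq for seq in others):
            candidates.append((start, fragment))
    return candidates
```

A fuller screen would also compare against reverse complements and tolerate
near matches, since closely related sequences may still cross-hybridize.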

Each of the ORFs of the Streptococcus pneumoniae genome disclosed in
Tables 1, 2 and 3, and the EMFs found 5' to the ORFs, can be used as
polynucleotide reagents in numerous ways. For example, the sequences can be
used as diagnostic probes or diagnostic amplification primers to detect the presence
of a specific microbe in a sample, particularly Streptococcus pneumoniae.
Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most
likely to be highly selective for Streptococcus pneumoniae. Also particularly
preferred are ORFs that can be used to distinguish between strains of Streptococcus
pneumoniae, particularly those that distinguish medically important strains, such as
drug-resistant strains.

In addition, the fragments of the present invention, as broadly described,
can be used to control gene expression through triple helix formation or antisense
DNA or RNA, both of which methods are based on the binding of a polynucleotide
sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple
helix-forming oligonucleotides. Polynucleotides suitable for use in these methods are
usually 20 to 40 bases in length and are designed to be complementary to a region
of the gene involved in transcription, for triple-helix formation, or to the mRNA
itself, for antisense inhibition. Both techniques have been demonstrated to be
effective in model systems, and the requisite techniques are well known and
involve routine procedures. Triple helix techniques are discussed in, for example,
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques in
general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988).
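
The complementarity requirement for an antisense oligonucleotide can be
sketched as a reverse-complement operation on a chosen region of the mRNA. The
example below enforces the 20 to 40 base length range mentioned above and
returns an RNA oligonucleotide written 5' to 3'; the function name and the
0-based start coordinate are assumptions, and practical design considerations
such as target accessibility are not addressed.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna, start, length=20):
    """Return the antisense RNA (5' to 3') for mrna[start:start + length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    region = mrna[start:start + length].upper()
    if len(region) < length:
        raise ValueError("region runs past the end of the mRNA")
    # Complement each base, then reverse so the oligonucleotide reads 5' to 3'.
    return region.translate(RNA_COMPLEMENT)[::-1]
```

Read in the opposite direction, every base of the returned oligonucleotide
pairs with the corresponding base of the target region.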

The present invention further provides recombinant constructs comprising
one or more fragments of the Streptococcus pneumoniae genomic fragments and
contigs of the present invention. Certain preferred recombinant constructs of the
present invention comprise a vector, such as a plasmid or viral vector, into which a
fragment of the Streptococcus pneumoniae genome has been inserted, in a forward
or reverse orientation. In the case of a vector comprising one of the ORFs of the
present invention, the vector may further comprise regulatory sequences, including
for example, a promoter, operably linked to the ORF. For vectors comprising the
EMFs of the present invention, the vector may further comprise a marker sequence
or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of
skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of
example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK,
pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene);
pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia).
Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG
(available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from
Pharmacia).

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers.
Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic
promoters include CMV immediate early, HSV thymidine kinase, early and late
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the
appropriate vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the
isolated fragments of the Streptococcus pneumoniae genomic fragments and
contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host
cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or
a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct
comprising an ORF of the present invention, may be introduced into the host by a
variety of well established techniques that are standard in the art, such as calcium
phosphate transfection, DEAE-dextran mediated transfection and electroporation,
which are described in, for instance, Davis, L. et al., BASIC METHODS IN
MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Streptococcus
pneumoniae genomic fragments and contigs of the present invention can be used
in conventional manners to produce the gene product encoded by the isolated
fragment (in the case of an ORF) or can be used to produce a heterologous protein
under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the
nucleic acid fragments of the present invention or by degenerate variants of the
nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the
present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy
of the Genetic Code, encode an identical polypeptide sequence.

Preferred nucleic acid fragments of the present invention are the ORFs and
subfragments thereof depicted in Tables 2 and 3 which encode proteins.

A variety of methodologies known in the art can be utilized to obtain any
one of the isolated polypeptides or proteins of the present invention. At the
simplest level, the amino acid sequence can be synthesized using commercially
available peptide synthesizers. This is particularly useful in producing small
peptides and fragments of larger polypeptides. Such short fragments as may be
obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from
bacterial cells which naturally produce the polypeptide or protein. One skilled in
the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention
produced naturally by a bacterial strain, or by other methods. Methods for
isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified
from cells which have been altered to express the desired polypeptide or protein.
As used herein, a cell is said to be altered to express a desired polypeptide or
protein when the cell, through genetic manipulation, is made to produce a
polypeptide or protein which it normally does not produce or which the cell
normally produces at a lower level. Those skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell which
produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of
the present invention. These include, but are not limited to, eukaryotic hosts such
as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts
such as E. coli and B. subtilis. The most preferred cells are those which do not
normally express the particular polypeptide or protein or which express the
polypeptide or protein at a low natural level.






"Recombinant," as used herein, means that a polypeptide or protein is
derived from recombinant (e.g., microbial or mammalian) expression systems.
"Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous
substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of
glycosylation modifications; polypeptides or proteins expressed in yeast will have a
glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides.
Generally, DNA segments encoding the polypeptides and proteins provided by this
invention are assembled from fragments of the Streptococcus pneumoniae genome
and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a
synthetic gene which is capable of being expressed in a recombinant transcriptional
unit comprising regulatory elements derived from a microbial or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. The
expression vehicle can comprise a transcriptional unit comprising an assembly of
(1) genetic regulatory elements necessary for gene expression in the host,
including elements required to initiate and maintain transcription at a level sufficient
for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a
structural or coding sequence which is transcribed into mRNA and translated into
protein; and (3) appropriate signals to initiate translation at the beginning of the
desired coding region and terminate translation at its end. Structural units intended
for use in yeast or eukaryotic expression systems preferably include a leader
sequence enabling extracellular secretion of translated protein by a host cell.
Alternatively, where recombinant protein is expressed without a leader or transport
sequence, it may include an N-terminal methionine residue. This residue may or
may not be subsequently cleaved from the expressed recombinant protein to
provide a final product.
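
As an illustration of items (2) and (3) above, the following is a minimal sketch of a sanity check on a candidate coding sequence before it is placed in a transcriptional unit. It is not part of the disclosed method. The function name check_coding_sequence is hypothetical, and the sketch assumes an ATG start codon (consistent with the N-terminal methionine noted above) and the three standard stop codons; hosts that use alternative start codons or genetic codes would need different checks.

```python
# Assumption: the standard genetic code, in which these three codons terminate translation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_coding_sequence(dna):
    """Return a list of problems found in a DNA coding sequence.

    An empty list means the sequence starts with ATG, is a whole number
    of codons long, ends with a stop codon, and has no earlier in-frame stop.
    """
    seq = "".join(dna.split()).upper()  # drop whitespace, normalise case
    problems = []

    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    if not seq.startswith("ATG"):
        problems.append("does not begin with an ATG start codon")

    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]

    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains an in-frame stop codon before the end")

    return problems
```

A sequence that passes returns an empty list, so a caller can report every problem at once instead of stopping at the first one.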

"Recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the 
recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express
heterologous polypeptides or proteins upon induction of the regulatory elements
linked to the DNA segment or synthetic gene to be expressed. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or 
other cells under the control of appropriate promoters. Cell-free translation 
systems can also be employed to produce such proteins using RNAs derived from
the DNA constructs of the present invention. Appropriate cloning and expression
vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which
is hereby incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of
replication and selectable markers permitting transformation of the host cell, e.g.,
the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a
downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor,
acid phosphatase, or heat shock proteins, among others. The heterologous
structural sequence is assembled in appropriate phase with translation initiation and
termination sequences, and preferably, a leader sequence capable of directing
secretion of translated protein into the periplasmic space or extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an
N-terminal identification peptide imparting desired characteristics, e.g., stabilization
or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a
structural DNA sequence encoding a desired protein together with suitable
translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable
markers and an origin of replication to ensure maintenance of the vector and, when
desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of E. coli, B.
subtilis, Salmonella typhimurium and various species within the genera
Pseudomonas and Streptomyces. Others may also be employed as a matter of
choice.

As a representative but non-limiting example, useful expression vectors for
bacterial use can comprise a selectable marker and bacterial origin of replication
derived from commercially available plasmids comprising genetic elements of the
well known cloning vector pBR322 (ATCC 37017). Such commercial vectors
include, for example, pKK223-3 (available from Pharmacia Fine Chemicals,
Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate
promoter and the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the host
strain to an appropriate cell density, the selected promoter, where it is inducible, is
derepressed or induced by appropriate means (e.g., temperature shift or chemical
induction) and cells are cultured for an additional period to provide for expression
of the induced gene product. Thereafter cells are typically harvested, generally by
centrifugation, disrupted to release expressed protein, generally by physical or
chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the
COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example,
the C127, 3T3, CHO, HeLa and BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer,
splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are
usually isolated by initial extraction from cell pellets, followed by one or more
salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any
convenient method, including freeze-thaw cycling, sonication, mechanical
disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high
performance liquid chromatography (HPLC) can be employed for final purification
steps.

The present invention further includes isolated polypeptides, proteins and
nucleic acid molecules which are substantially equivalent to those herein described.
As used herein, substantially equivalent can refer both to nucleic acid and amino
acid sequences, for example a mutant sequence, that varies from a reference
sequence by one or more substitutions, deletions, or additions, the net effect of
which does not result in an adverse functional dissimilarity between reference and
subject sequences. For purposes of the present invention, sequences having
equivalent biological activity, and equivalent expression characteristics are
considered substantially equivalent. For purposes of determining equivalence,
truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other
strains of Streptococcus pneumoniae, of the fragments of the Streptococcus
pneumoniae genome of the present invention and homologs of the proteins encoded
by the ORFs of the present invention. As used herein, a sequence or protein of
Streptococcus pneumoniae is defined as a homolog of a fragment of the
Streptococcus pneumoniae fragments or contigs or a protein encoded by one of the
ORFs of the present invention, if it shares significant homology to one of the
fragments of the Streptococcus pneumoniae genome of the present invention or a
protein encoded by one of the ORFs of the present invention. Specifically, by
using the sequence disclosed herein as a probe or as primers, and techniques such
as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain
homologs.

As used herein, two nucleic acid molecules or proteins are said to "share
significant homology" if the two contain regions which possess greater than 85%
sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those
with 93% or more homology. Among especially preferred homologs those with
95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those
are homologs with 99% or more homology. The most preferred homologs among
these are those with 99.9% homology or more. It will be understood that, among
measures of homology, identity is particularly preferred in this regard.
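
The thresholds above can be illustrated with a short sketch that computes percent identity over two already-aligned sequences and reports the preference tier it falls into. This is offered only as an illustration under stated assumptions, not as the method used by the applicants. The function names are hypothetical, gap columns are skipped (other conventions count them as mismatches), and the 97% tier is read as "97% or more".

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are excluded from the comparison
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# (threshold, strictly_greater, label), most stringent first.
# "greater than 85%" and "more than 90%" are strict; the rest are "or more".
TIERS = [
    (99.9, False, "most preferred"),
    (99.0, False, "even more particularly preferred"),
    (97.0, False, "very particularly preferred"),
    (95.0, False, "particularly preferred"),
    (93.0, False, "especially preferred"),
    (90.0, True, "preferred"),
    (85.0, True, "significant homology"),
]


def homology_tier(pct):
    """Map a percent identity onto the tiers named in the text."""
    for threshold, strict, label in TIERS:
        if pct > threshold or (pct == threshold and not strict):
            return label
    return "below the significant-homology threshold"
```

For example, two aligned 10-base sequences differing at one position give 90.0% identity, which clears the "greater than 85%" threshold but not the "more than 90%" one.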

Region specific primers or probes derived from the nucleotide sequence
provided in SEQ ID NOS: 1-391 or from a nucleotide sequence at least 95%,
particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS: 1-391 can be used to prime DNA synthesis and PCR amplification, as
well as to identify colonies containing cloned DNA encoding a homolog. Methods
suitable to this aspect of the present invention are well known and have been
described in great detail in many publications such as, for example, Innis et al.,
PCR Protocols, Academic Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS: 1-391 or from a nucleotide
sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-391,
one skilled in the art will recognize that by employing high stringency conditions
(e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at
50-65°C in 0.5X SSPC) only sequences which are greater than 75% homologous to
the primer will be amplified. By employing lower stringency conditions (e.g.,
hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C
in 0.5X SSPC), sequences which are greater than 40-50% homologous to the
primer will also be amplified.

When using DNA probes derived from SEQ ID NOS: 1-391, or from a
nucleotide sequence having an aforementioned identity to a sequence of SEQ ID
NOS: 1-391, for colony/plaque hybridization, one skilled in the art will recognize
that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X
SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences
having regions which are greater than 90% homologous to the probe can be
obtained, and that by employing lower stringency conditions (e.g., hybridizing at
35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X
SSPC), sequences having regions which are greater than 35-45% homologous to
the probe will be obtained.

Any organism can be used as the source for homologs of the present
invention so long as the organism naturally expresses such a protein or contains
genes encoding the same. The most preferred organisms for isolating homologs are
bacteria which are closely related to Streptococcus pneumoniae.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION

Each ORF provided in Tables 1 and 2 is identified with a function by
homology to a known gene or polypeptide. As a result, one skilled in the art can
use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the
polypeptide. Such identifications permit one skilled in the art to use the
Streptococcus pneumoniae ORFs in a manner similar to the known type of
sequences for which the identification is made; for example, to ferment a particular
sugar source or to produce a particular metabolite. A variety of reviews illustrative
of this aspect of the invention are available, including the following reviews on the
industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND
BIOTECHNOLOGY HANDBOOK, 2nd Ed., MacMillan Publications, Ltd. NY
(1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds.,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). A variety of
exemplary uses that illustrate this and similar aspects of the present invention are
discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins involved in mediating the catalytic
reactions involved in intermediary and macromolecular metabolism, the
biosynthesis of small molecules, cellular processes and other functions, including
enzymes involved in the degradation of the intermediary products of metabolism,
enzymes involved in central intermediary metabolism, enzymes involved in
respiration, both aerobic and anaerobic, enzymes involved in fermentation,
enzymes involved in ATP proton motive force conversion, enzymes involved in
broad regulatory function, enzymes involved in amino acid synthesis, enzymes
involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin
synthesis, can be used for industrial biosynthesis.

The various metabolic pathways present in Streptococcus pneumoniae can
be identified based on absolute nutritional requirements as well as by examining the
various enzymes identified in Tables 1-3 and SEQ ID NOS: 1-391.

Of particular interest are polypeptides involved in the degradation of
intermediary metabolites as well as non-macromolecular metabolism. Such 
enzymes include amylases, glucose oxidases, and catalase. 

Proteolytic enzymes are another class of commercially important enzymes.
Proteolytic enzymes find use in a number of industrial processes including the
processing of flax and other vegetable fibers, in the extraction, clarification and
depectinization of fruit juices, in the extraction of vegetable oil and in the
maceration of fruits and vegetables to give unicellular fruits. A detailed review of
the proteolytic enzymes used in the food industry is provided in Rombouts et al.,
Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural
Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium
Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism
of Streptococcus pneumoniae. Enzymes involved in the degradation of sugars,
such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from
a commercial viewpoint, include sugar isomerases such as glucose isomerase.
Other metabolic enzymes have found commercial use, such as glucose oxidases,
which produce ketogulonic acid (KGA). KGA is an intermediate in the
commercial production of ascorbic acid using Reichstein's procedure, as
described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag Press,
Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in
purified form as well as in an immobilized form for the deoxygenation of beer.
See, for instance, Hartmeir et al., Biotechnology Letters 7:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid.
Gluconic acids find a market in the detergent, textile, leather,
photographic, pharmaceutical, food, feed and concrete industries, as described, for
example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS
AND FUNGI, Benett et al., Eds., Academic Press, New York (1985). In addition
to industrial applications, GOD has found applications in medicine for quantitative
determination of glucose in body fluids and, recently, in biotechnology for analyzing
syrups from starch and cellulose hydrolysates. This application is described in
Owusu et al., Biochem. et Biophysica Acta 872:83 (1986), for instance.

The main sweetener used in the world today is sugar which comes from
sugar beets and sugar cane. In the field of industrial enzymes, the glucose
isomerase process shows the largest expansion in the market today. Initially,
soluble enzymes were used and later immobilized enzymes were developed
(Krueger et al., Biotechnology, The Textbook of Industrial Microbiology, Sinauer
Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business
using immobilized enzymes. A review of the industrial use of these enzymes is
provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent
additives and thus represent one of the largest volumes of microbial enzymes used
in the industrial sector. Because of their industrial importance, there is a large body
of published and unpublished information regarding the use of these enzymes in
industrial processes. (See Faultman et al., Acid Proteases Structure Function and
Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al.,
Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al.,
Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are
the microbial lipases, described by, for instance, Macrae et al., Philosophical
Transactions of the Chiral Society of London 310:221 (1985) and Poserke, Journal
of the American Oil Chemist Society 67:1758 (1984). A major use of lipases is in
the fat and oil industry for the production of neutral glycerides using lipase
catalyzed inter-esterification of readily available triglycerides. Applications of
lipases include use as a detergent additive to facilitate the removal of fats from
fabrics in the course of the washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for
key steps in the synthesis of complex organic molecules is gaining popularity at a
great rate. One area of great interest is the preparation of chiral intermediates.
Preparation of chiral intermediates is of interest to a wide range of synthetic
chemists, particularly those scientists involved with the preparation of new
pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent
Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press,
Boca Raton, Florida (1990)). The following reactions catalyzed by enzymes are of
interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters,
amides and nitriles, esterification reactions, trans-esterification reactions, synthesis
of amides, reduction of alkanones and oxoalkanoates, oxidation of alcohols to
carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming
reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the
present invention for biotransformation and organic synthesis it is sometimes
necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. Pros and cons of using a whole
cell system on the one hand or an isolated partially purified enzyme on the other
hand have been described in detail by Bud et al., Chemistry in Britain (1987), p.
127.

Amino transferases, enzymes involved in the biosynthesis and metabolism
of amino acids, are useful in the catalytic production of amino acids. The
advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and
generally possess uniformly high catalytic rates. A description of the use of amino
transferases for amino acid production is provided by Roselle-David, Methods of
Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present
invention includes enzymes involved in nucleic acid synthesis, repair, and
recombination.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as
homologs thereof, can be used in a variety of procedures and methods known in
the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the
protein. Such antibodies can be either monoclonal or polyclonal antibodies, as well
as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of 
the proteins of the present invention and hybridomas which produce these 
antibodies. A hybridoma is an immortalized cell line which is capable of secreting 
a specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies
as well as hybridomas capable of producing the desired antibody are well known in
the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory
Techniques In Biochemistry And Molecular Biology, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods
35:1-21 (1980); Kohler and Milstein, Nature 256:495-497 (1975)), and include the
trioma technique and the human B-cell hybridoma technique (Kozbor et al.,
Immunology Today 4:11 (1983); pgs. 77-96 of Cole et al., in Monoclonal
Antibodies And Cancer Therapy, Alan R. Liss, Inc. (1985)). Any animal (mouse,
rabbit, etc.) which is known to produce antibodies can be immunized with the
pseudogene polypeptide. Methods for immunization are well known in the art.
Such methods include subcutaneous or intraperitoneal injection of the polypeptide.
One skilled in the art will recognize that the amount of the protein encoded by the
ORF of the present invention used for immunization will vary based on the animal
which is immunized, the antigenicity of the peptide and the site of injection.

The protein which is used as an immunogen may be modified or
administered in an adjuvant in order to increase the protein's antigenicity. Methods
of increasing the antigenicity of a protein are well known in the art and include, but
are not limited to, coupling the antigen with a heterologous protein (such as globulin
or galactosidase) or through the inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are
removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and
allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to
identify the hybridoma cell which produces an antibody with the desired
characteristics. These include screening the hybridomas with an ELISA assay,
western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res.
175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and 
subclass is determined using procedures known in the art (Campbell, A. M., 
Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands 
(1984)). 

Techniques described for the production of single chain antibodies (U.S.
Patent 4,946,778) can be adapted to produce single chain antibodies to proteins of
the present invention.

For polyclonal antibodies, antibody containing antisera is isolated from the 
immunized animal and is screened for the presence of antibodies with the desired 
specificity using one of the above-described procedures. 

The present invention further provides the above-described antibodies in
detectably labelled form. Antibodies can be detectably labelled through the use of
radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such
as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as
FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing
such labeling are well-known in the art; for example, see Sternberger et al., J.
Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308
(1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol.
Meth. 75:215 (1976).

The labeled antibodies of the present invention can be used for in vitro, in 
vivo, and in situ assays to identify cells or tissues in which a fragment of the 
Streptococcus pneumoniae genome is expressed.

The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics
such as polycarbonate, complex carbohydrates such as agarose and sepharose,
and acrylic resins such as polyacrylamide and latex beads. Techniques for
coupling antibodies to such solid supports are well known in the art (Weir, D. M.
et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth.
Enzym. 34 Academic Press, N. Y. (1974)). The immobilized antibodies of the
present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits 

The present invention further provides methods to identify the expression
of one of the ORFs of the present invention, or homolog thereof, in a test sample,
using one of the DFs or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more 
of the antibodies or one or more of the DFs of the present invention and assaying 
for binding of the DFs or antibodies to components within the test sample. 

Conditions for incubating a DF or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the DF or antibody used in the
assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted
to employ the DFs or antibodies of the present invention. Examples of such assays
can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986);
Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press,
Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and
Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1985).

The test samples of the present invention include cells, protein or membrane
extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or
urine. The test sample used in the above-described method will vary based on the
assay format, nature of the detection method and the tissues, cells or extracts used
as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to
obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which
contain the necessary reagents to carry out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in 
close confinement, one or more containers which comprises: (a) a first container 
comprising one of the DFs or antibodies of the present invention; and (b) one or 
more other containers comprising one or more of the following: wash reagents,
reagents capable of detecting presence of a bound DF or antibody. 

In detail, a compartmentalized kit includes any kit in which reagents are
contained in separate containers. Such containers include small glass containers,
plastic containers or strips of plastic or paper. Such containers allow one to
efficiently transfer reagents from one compartment to another compartment such
that the samples and reagents are not cross-contaminated, and the agents or
solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept
the test sample, a container which contains the antibodies used in the assay,
containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound
antibody or DF.

Types of detection reagents include labelled nucleic acid probes, labelled 
secondary antibodies, or in the alternative, if the primary antibody is labelled, the 
enzymatic, or antibody binding reagents which are capable of reacting with the
labelled antibody. One skilled in the art will readily recognize that the disclosed 
DFs and antibodies of the present invention can be readily incorporated into one of 
the established kit formats which are well known in the art. 

4. Screening Assay for Binding Agents






Using the isolated proteins of the present invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a 
protein encoded by one of the ORFs of the present invention or to one of the 
fragments and the Streptococcus pneumoniae fragment and contigs herein 
described.

In general, such methods comprise steps of: 

(a) contacting an agent with an isolated protein encoded by one of the 
ORFs of the present invention, or an isolated fragment of the Streptococcus 
pneumoniae genome; and 

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, 
peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The 
agents can be selected and screened at random or rationally selected or designed 
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates,
pharmaceutical agents and the like are selected at random and are assayed for their
ability to bind to the protein encoded by the ORF of the present invention.

Alternatively, agents may be rationally selected or designed. As used
herein, an agent is said to be "rationally selected or designed" when the agent is
chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate
peptides, pharmaceutical agents and the like capable of binding to a specific peptide
sequence in order to generate rationally designed antipeptide peptides (for example,
see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in
Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307,
and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or
the like.

In addition to the foregoing, one class of agents of the present invention, as 
broadly described, can be used to control gene expression through binding to one 
of the ORFs or EMFs of the present invention. As described above, such agents
can be randomly screened or rationally designed/selected. Targeting the ORF or 
EMF allows a skilled artisan to design sequence specific or element specific agents, 
modulating the expression of either a single ORF or multiple ORFs which rely on 
the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues
which hybridize or form a triple helix by binding to DNA or RNA. Such agents
can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a
variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and
are designed to be complementary to a region of the gene involved in transcription
(triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the
mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991);
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of the present invention can be used to design antisense and triple helix-
forming oligonucleotides, and other DNA binding agents.
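
As an illustration of how sequence information can be turned into candidate antisense oligonucleotides, the following is a minimal sketch that enumerates oligos complementary to windows of a coding-strand sequence. The function names and the default length are assumptions made for this example only; the 20 to 40 base range is taken from the text above, and no selection criteria beyond complementarity are implied.

```python
# Hypothetical helpers for illustration; not part of the disclosed methods.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_candidates(coding_strand: str, length: int = 20):
    """Yield (start, oligo) pairs.

    Each oligo is a DNA sequence complementary to one window of the
    coding (mRNA-equivalent) strand, written 5' to 3'.
    """
    if not 20 <= length <= 40:
        raise ValueError("the text suggests oligos of 20 to 40 bases")
    for start in range(len(coding_strand) - length + 1):
        window = coding_strand[start:start + length]
        yield start, reverse_complement(window)
```

In practice a candidate list of this kind would still need to be filtered experimentally, since complementarity alone does not predict efficacy.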

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be
used to modulate the growth or pathogenicity of Streptococcus pneumoniae, or
another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known
techniques to provide a pharmaceutical composition. As used herein, the
"pharmaceutical agents of the present invention" refers to the pharmaceutical agents
which are derived from the proteins encoded by the ORFs of the present invention
or are agents which are identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or
pathogenicity of Streptococcus pneumoniae or a related organism, in vivo or in
vitro" when the agent reduces the rate of growth, rate of division, or viability of
the organism in question. The pharmaceutical agents of the present invention can
modulate the growth or pathogenicity of an organism in many fashions, although
an understanding of the underlying mechanism of action is not needed to practice
the use of the pharmaceutical agents of the present invention. Some agents will
modulate the growth by binding to an important protein thus blocking the biological
activity of the protein, while other agents may bind to a component of the outer
surface of the organism blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a
protein encoded by one of the ORFs of the present invention and serve as a
vaccine. The development and use of a vaccine based on outer membrane
components are well known in the art.

As used herein, a "related organism" is a broad term which refers to any 
organism whose growth can be modulated by one of the pharmaceutical agents of 
the present invention. In general, such an organism will contain a homolog of the 
protein which is the target of the pharmaceutical agent or the protein used as a 
vaccine. As such, related organisms do not need to be bacterial but may be fungal
or viral pathogens. 

The pharmaceutical agents and compositions of the present invention may 
be administered in a convenient manner, such as by the oral, topical, intravenous, 
intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes. The
pharmaceutical compositions are administered in an amount which is effective for
treating and/or prophylaxis of the specific indication. In general, they are
administered in an amount of at least about 1 mg/kg body weight and in most cases
they will be administered in an amount not in excess of about 1 g/kg body weight
per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body
weight daily, taking into account the routes of administration, symptoms, etc.

The agents of the present invention can be used in native form or can be 
modified to form a chemical derivative. As used herein, a molecule is said to be a 
"chemical derivative" of another molecule when it contains additional chemical 
moieties not normally a part of the molecule. Such moieties may improve the
molecule's solubility, absorption, biological half life, etc. The moieties may
alternatively decrease the toxicity of the molecule, eliminate or attenuate any 
undesirable side effect of the molecule, etc. Moieties capable of mediating such 
effects are disclosed in, among other sources, REMINGTON'S 
PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein. 

For example, such moieties may change an immunological character of the
functional derivative, such as affinity for a given antibody. Such changes in
immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox
or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic
degradation or the tendency to aggregate with carriers or into multimers also may
be effected in this way and can be assayed by methods well known to the skilled
artisan.

The therapeutic effects of the agents of the present invention may be
obtained by providing the agent to a patient by any suitable means (e.g., inhalation,
intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an
effective concentration within the blood or tissue in which the growth of the
organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be
by continuous infusion, or by single or multiple injections.

In providing a patient with one of the agents of the present invention, the 
dosage of the administered agent will vary depending upon such factors as the 
patient's age, weight, height, sex, general medical condition, previous medical 
history, etc. In general, it is desirable to provide the recipient with a dosage of
agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of
patient), although a lower or higher dosage may be administered. The 
therapeutically effective dose can be lowered by using combinations of the agents 
of the present invention or another agent. 

As used herein, two or more compounds or agents are said to be
administered "in combination" with each other when either (1) the physiological
effects of each compound, or (2) the serum concentrations of each compound can 
be measured at the same time. The composition of the present invention can be 
administered concurrently with, prior to, or following the administration of the 
other agent. 

The agents of the present invention are intended to be provided to recipient
subjects in an amount sufficient to decrease the rate of growth (as defined above) of
the target organism.

The administration of the agent(s) of the invention may be for either a
"prophylactic" or "therapeutic" purpose. When provided prophylactically, the
agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent,
attenuate, or decrease the rate of onset of any subsequent infection. When
provided therapeutically, the agent(s) are provided at (or shortly after) the onset of
an indication of infection. The therapeutic administration of the compound(s)
serves to attenuate the pathological symptoms of the infection and to increase the
rate of recovery.

The agents of the present invention are administered to a subject, such as a 
mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically 
effective concentration. A composition is said to be "pharmacologically acceptable"
if its administration can be tolerated by a recipient patient. Such an agent is said to 
be administered in a "therapeutically effective amount" if the amount administered 
is physiologically significant. An agent is physiologically significant if its presence 
results in a detectable change in the physiology of a recipient patient. 

The agents of the present invention can be formulated according to known
methods to prepare pharmaceutically useful compositions, whereby these materials,
or their functional derivatives, are combined in a mixture with a pharmaceutically
acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of
other human proteins, e.g., human serum albumin, are described, for example, in
REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed., Osol, A., Ed.,
Mack Publishing, Easton PA (1980). In order to form a pharmaceutically
acceptable composition suitable for effective administration, such compositions will
contain an effective amount of one or more of the agents of the present invention,
together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration
of action. Controlled release preparations may be achieved through the use of
polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques,
including formulation with macromolecules such as, for example, polyesters,
polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the
macromolecules and the agent in the formulation, and by appropriate use of
methods of incorporation, which can be manipulated to effectuate a desired time
course of release. Another possible method to control the duration of action by
controlled release preparations is to incorporate agents of the present invention into
particles of a polymeric material such as polyesters, polyamino acids, hydrogels,
poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is possible to entrap these
materials in microcapsules prepared, for example, by coacervation techniques or by
interfacial polymerization with, for example, hydroxymethylcellulose or gelatine-microcapsules
and poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes, albumin microspheres,
microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES
(1980).

The invention further provides a pharmaceutical pack or kit comprising one 
or more containers filled with one or more of the ingredients of the pharmaceutical 
compositions of the invention. Associated with such container(s) can be a notice in 
the form prescribed by a governmental agency regulating the manufacture, use or 
sale of pharmaceuticals or biological products, which notice reflects approval by
the agency of manufacture, use or sale for human administration. 

In addition, the agents of the present invention may be employed in 
conjunction with other therapeutic compounds. 

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be 
sequenced using a random shotgun approach. This procedure, described in detail 
in the examples that follow, has eliminated the up-front cost of isolating and
ordering overlapping or contiguous subclones prior to the start of the sequencing
protocols.

Certain aspects of the present invention are described in greater detail in the 
examples that follow. The examples are provided by way of illustration. Other 
aspects and embodiments of the present invention are contemplated by the 
inventors, as will be clear to those of skill in the art from reading the present 
disclosure.



ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING
1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing
follows from the Lander and Waterman (Lander and Waterman, Genomics
2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in
nucleotides, is not sequenced after a certain amount, n, in nucleotides, of random
sequence has been determined can be calculated by the equation $P_0 = e^{-m}$, where m
is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8
Mb of sequence has been randomly generated (1X coverage). At that point, $P_0 =
e^{-1} = 0.37$. The probability that any given base has not been sequenced is the same
as the probability that any region of the whole sequence L has not been determined
and, therefore, is equivalent to the fraction of the whole sequence that has yet to be
determined. Thus, at one-fold coverage, approximately 37% of a polynucleotide of
size L, in nucleotides, has not been sequenced. When 14 Mb of sequence has been
generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to
0.0067 or 0.67%. 5X coverage of a 2.8 Mb sequence can be attained by sequencing
approximately 17,000 random clones from both insert ends with an average
sequence read length of 410 bp.
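
The coverage arithmetic above can be checked with a short calculation. The following is a minimal sketch, assuming the Poisson relation stated in the text; the function and parameter names are chosen for this example only.

```python
import math


def unsequenced_fraction(genome_bp: float, sequenced_bp: float) -> float:
    """Poisson estimate of the fraction of bases not yet sequenced.

    m = sequenced_bp / genome_bp is the fold coverage, and P0 = exp(-m).
    """
    m = sequenced_bp / genome_bp
    return math.exp(-m)


def clones_needed(genome_bp: float, coverage: float, read_bp: float,
                  reads_per_clone: int = 2) -> int:
    """Number of clones required for a target fold coverage.

    reads_per_clone=2 corresponds to sequencing each insert from both ends.
    """
    reads = coverage * genome_bp / read_bp
    return math.ceil(reads / reads_per_clone)


p_1x = unsequenced_fraction(2.8e6, 2.8e6)   # about 0.37
p_5x = unsequenced_fraction(2.8e6, 14e6)    # about 0.0067
n_clones = clones_needed(2.8e6, 5, 410)     # roughly 17,000 clones
```

The clone count follows directly from dividing the 14 Mb of required sequence by the 410 bp average read length and then by two reads per clone.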

Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$,
and the average gap size, g, follows the equation $g = L/n$, where n in this
expression is the number of random sequence reads. Thus, 5X coverage
leaves about 240 gaps averaging about 82 bp in size in a sequence of a
polynucleotide 2.8 Mb long.
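
The gap estimates can be reproduced in the same way. This sketch assumes that n in the average gap size expression counts sequence reads (an interpretation inferred from the 82 bp figure, not stated explicitly in the text) and that the number of gaps is the total gap length divided by the average gap size. With about 34,146 reads of 410 bp it yields an average gap of about 82 bp and roughly 230 gaps, of the same order as the approximately 240 quoted above.

```python
import math


def gap_statistics(genome_bp: float, n_reads: int, read_bp: float):
    """Return (fold coverage, total gap length, mean gap size, gap count).

    Assumes Lander-Waterman style estimates:
      G = L * exp(-m)   total gap length
      g = L / n         mean gap size, n being the number of reads
      G / g             expected number of gaps
    """
    m = n_reads * read_bp / genome_bp
    total_gap = genome_bp * math.exp(-m)
    mean_gap = genome_bp / n_reads
    n_gaps = total_gap / mean_gap
    return m, total_gap, mean_gap, n_gaps


# 5X coverage of 2.8 Mb with 410 bp reads needs about 34,146 reads
m, total_gap, mean_gap, n_gaps = gap_statistics(2.8e6, 34146, 410)
```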

The treatment above is essentially that of Lander and Waterman, Genomics 
2:231 (1988). 

2. Random Library Construction

In order to approximate the random model described above during actual 
sequencing, a nearly ideal library of cloned genomic fragments is required. The 
following library construction procedure was developed to achieve this end. 

Streptococcus pneumoniae DNA is prepared by phenol extraction. A
mixture containing 200 µg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-
HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical
Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The
sonicated DNA is ethanol precipitated and redissolved in 500 µl TE buffer.

To create blunt-ends, a 100 µl aliquot of the resuspended DNA is digested
with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200
µl BAL31 buffer. The digested DNA is phenol-extracted, ethanol-precipitated,
redissolved in 100 µl TE buffer, and then size-fractionated by electrophoresis
through a 1.0% low melting temperature agarose gel. The section containing DNA
fragments 1.6-2.0 kb in size is excised from the gel, and the LGT agarose is melted
and the resulting solution is extracted with phenol to separate the agarose from the
DNA. DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for
ligation to vector.

A two-step ligation procedure is used to produce a plasmid library with
97% inserts, of which >99% were single inserts. The first ligation mixture (50 µl)
contains 2 µg of DNA fragments, 2 µg pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase
(GIBCO/BRL) and is incubated at 14°C for 4 hr. The ligation mixture then is
phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in
20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete
bands in a ladder are visualized by ethidium bromide-staining and UV illumination
and identified by size as insert (i), vector (v), v+i, v+2i, v+3i, etc. The portion of
the gel containing v+i DNA is excised and the v+i DNA is recovered and
resuspended into 20 µl TE. The v+i DNA then is blunt-ended by T4 polymerase
treatment for 5 min. at 37°C in a reaction mixture (50 µl) containing the v+i linears,
500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New England
BioLabs), under recommended buffer conditions. After phenol extraction and
ethanol precipitation the repaired v+i linears are dissolved in 20 µl TE. The final
ligation to produce circles is carried out in a 50 µl reaction containing 5 µl of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the
following day, the reaction mixture is stored at -20°C.

This two-stage procedure results in a molecularly random collection of 
single-insert plasmid recombinants with minimal contamination from double-insert 
chimeras (<1%) or free vector (<3%).

Since deviation from randomness can arise from propagation of the DNA in
the host, E. coli host cells deficient in all recombination and restriction functions
(A. Greener, Strategies 3(1):5 (1990)) are used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells are
plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase
which allows multiplication and selection of the most rapidly growing cells.

Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE
II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a
chilled Falcon 2059 tube on ice. A 1.7 µl aliquot of 1.42 M beta-mercaptoethanol
is added to the aliquot of cells to a final concentration of 25 mM. Cells are
incubated on ice for 10 min. A 1 µl aliquot of the final ligation is added to the cells
and incubated on ice for 30 min. The cells are heat pulsed for 30 sec. at 42°C and
placed back on ice for 2 min. The outgrowth period in liquid culture is eliminated
from this protocol in order to minimize the preferential growth of any given
transformed cell. Instead the transformation mixture is plated directly on a nutrient
rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g
tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5
ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml
SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal
(2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4/100 ml SOB agar. The 15 ml top layer
is poured just prior to plating. Our titer is approximately 100 colonies/10 µl aliquot
of transformation.
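
The reagent concentrations quoted in the plating protocol follow from simple dilution arithmetic. The sketch below, with a function name chosen for illustration, computes the final concentration after adding a small volume of stock to a larger volume. Applied to the figures above it gives about 23.7 mM beta-mercaptoethanol (nominally 25 mM) and about 0.2 mg/ml ampicillin in the bottom agar layer.

```python
def final_concentration(stock_conc: float, stock_vol: float,
                        base_vol: float) -> float:
    """Concentration after mixing stock_vol of stock into base_vol.

    Units of the result match stock_conc; the two volumes must share a unit.
    """
    return stock_conc * stock_vol / (stock_vol + base_vol)


# 1.7 ul of 1.42 M beta-mercaptoethanol added to 100 ul of cells (result in M)
bme_molar = final_concentration(1.42, 1.7, 100.0)

# 0.4 ml of 50 mg/ml ampicillin added to 100 ml of SOB agar (result in mg/ml)
amp_mg_per_ml = final_concentration(50.0, 0.4, 100.0)
```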

All colonies are picked for template preparation regardless of size. Thus, 
only clones lost due to "poison" DNA or deleterious gene products are deleted from 
the library, resulting in a slight increase in gap number over that expected. 

3. Random DNA Sequencing

High quality double stranded DNA plasmid templates are prepared using a
"boiling bead" method developed in collaboration with Advanced Genetic
Technology Corp. (Gaithersburg, MD) (Adams et al., Science 252:1651 (1991);
Adams et al., Nature 355:632 (1992)). Plasmid preparation is performed in a 96-
well format for all stages of DNA preparation from bacterial growth through final
DNA purification. Template concentration is determined using Hoechst Dye and a
Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding
templates are identified where possible and not sequenced.

Templates are also prepared from two Streptococcus pneumoniae lambda
genomic libraries. An amplified library is constructed in the vector Lambda GEM-
12 (Promega) and an unamplified library is constructed in Lambda DASH II
(Stratagene). In particular, for the unamplified lambda library, Streptococcus
pneumoniae DNA (> 100 kb) is partially digested in a reaction mixture (200 µl)
containing 50 µg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C.
The digested DNA is phenol-extracted and electrophoresed on a 0.5% low
melting agarose gel at 2V/cm for 7 hours. Fragments from 15 to 25 kb are excised
and recovered in a final volume of 6 µl. One µl of fragments is used with 1 µl of
DASH II vector (Stratagene) in the recommended ligation reaction. One µl of the
ligation mixture is used per packaging reaction following the recommended
protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage
are plated directly without amplification from the packaging mixture (after dilution
with 500 µl of recommended SM buffer and chloroform treatment). Yield is about
2.5x10^3 pfu/µl. The amplified library is prepared essentially as above except the
lambda GEM-12 vector is used. After packaging, about 3.5x10^4 pfu are plated on
the restrictive NM539 host. The lysate is harvested in 2 ml of SM buffer and
stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10^
pfu/ml.

Liquid lysates (100 µl) are prepared from randomly selected plaques (from
the unamplified library) and template is prepared by long-range PCR using T7 and
T3 vector-specific primers.

Sequencing reactions are carried out on plasmid and/or PCR templates
using the AB Catalyst LabStation with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and
the M13 reverse (M13RP1) primers (Adams et al., Nature 368:474 (1994)). Dye
terminator sequencing reactions are carried out on the lambda templates on a
Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction
Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence
the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library.

Sequencing reactions are performed by eight individuals using an average of
fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed
using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read
distance. The overall sequencing success rate is approximately 85% for
M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The
average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1
sequences, and 375 bp for dye-terminator reactions.

Richards et al., Chapter 28 in AUTOMATED DNA SEQUENCING AND
ANALYSIS, M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press,
London, (1994) described the value of using sequence from both ends of
sequencing templates to facilitate ordering of contigs in shotgun assembly projects
of lambda and cosmid clones. We balance the desirability of both-end sequencing
(including the reduced cost of lower total number of templates) against shorter
read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer
compared to the M13-21 (forward) primer. Approximately one-half of the
templates are sequenced from both ends. Random reverse sequencing reactions are
done based on successful forward sequencing reactions. Some M13RP1
sequences are obtained in a semi-directed fashion: M13-21 sequences pointing
outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to
specifically order contigs.

4. Protocol for Automated Cycle Sequencing

The sequencing is carried out using ABI Catalyst robots and AB 373
Automated DNA Sequencers. The Catalyst robot is a publicly available
sophisticated pipetting and temperature control robot which has been developed
specifically for DNA sequencing reactions. The Catalyst combines pre-aliquoted
templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the
thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and
reaction buffer. Reaction mixes and templates are combined in the wells of an
aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear
amplification (i.e., one primer synthesis) steps are performed including
denaturation, annealing of primer and template, and extension; i.e., DNA
synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents
evaporation without the need for an oil overlay.

Two sequencing protocols are used: one for dye-labelled primers and a
second for dye-labelled dideoxy chain terminators. The shotgun sequencing
involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye,
permitting the four individual reactions to be combined into one lane of the 373
DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently
supplies pre-mixed reaction mixes in bulk packages containing all the necessary
non-template reagents for sequencing. Sequencing can be done with both plasmid
and PCR-generated templates with both dye-primers and dye-terminators with
approximately equal fidelity, although plasmid templates generally give longer
usable sequences.

Thirty-two reactions are loaded per AB373 Sequencer each day, for a total
of 960 samples. Electrophoresis is run overnight following the manufacturer's
protocols, and the data is collected for twelve hours. Following electrophoresis
and fluorescence detection, the ABI 373 performs automatic lane tracking and
base-calling. The lane-tracking is confirmed visually. Each sequence electropherogram
(or fluorescence lane trace) is inspected visually and assessed for quality. Trailing
sequences of low quality are removed and the sequence itself is loaded via software
to a Sybase database (archived daily to 8mm tape). Leading vector polylinker
sequence is removed automatically by a software program. Average edited lengths
of sequences from the standard ABI 373 are around 400 bp and depend mostly on
the quality of the template used for the sequencing reaction. ABI 373 Sequencers
converted to Stretch Liners provide a longer electrophoresis path prior to
fluorescence detection and increase the average number of usable bases to 500-600
bp.
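
The two editing steps described above, removal of leading vector polylinker sequence and removal of low-quality trailing sequence, can be sketched as follows. This is an illustration only: the text relies on visual inspection and an unspecified software program, so the criterion used here (density of ambiguous "N" base calls in a sliding window), the exact-match vector search, and all names and thresholds are assumptions.

```python
def trim_leading_vector(read: str, vector_tail: str,
                        search_window: int = 60) -> str:
    """Drop everything up to and including vector_tail if it occurs
    near the start of the read; otherwise return the read unchanged."""
    pos = read.find(vector_tail, 0, search_window + len(vector_tail))
    return read if pos < 0 else read[pos + len(vector_tail):]


def trim_trailing_low_quality(read: str, window: int = 20,
                              max_n: int = 2) -> str:
    """Cut the read at the first 'N' of the first window that contains
    more than max_n ambiguous base calls (assumed quality proxy)."""
    last_start = max(len(read) - window + 1, 1)
    for start in range(last_start):
        if read[start:start + window].count("N") > max_n:
            return read[:read.index("N", start)]
    return read


def edit_read(read: str, vector_tail: str) -> str:
    """Apply vector removal first, then trailing trim."""
    return trim_trailing_low_quality(trim_leading_vector(read, vector_tail))
```

A production pipeline would normally tolerate mismatches when locating vector sequence; exact matching is used here only to keep the sketch short.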

INFORMATICS

1. Data Management

A number of information management systems for a large-scale sequencing
lab have been developed. (For review see, for instance, Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on
System Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993).)
The system used to collect and assemble the sequence data was developed using the
Sybase relational database management system and was designed to automate data
flow wherever possible and to reduce user error. The database stores and
correlates all information collected during the entire operation from template
preparation to final analysis of the genome. Because the raw output of the ABI 373
Sequencers was based on a Macintosh platform and the data management system
chosen was based on a Unix platform, it was necessary to design and implement a
variety of multi-user, client-server applications which allow the raw data as well as
analysis results to flow seamlessly into the database with a minimum of user effort.


2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and
accurate assembly of thousands of sequence fragments was employed to generate
contigs. The TIGR assembler simultaneously clusters and assembles fragments of
the genome. In order to obtain the speed necessary to assemble more than 10^4
fragments, the algorithm builds a hash table of 12 bp oligonucleotide subsequences
to generate a list of potential sequence fragment overlaps. The number of potential
overlaps for each fragment determines which fragments are likely to fall into
repetitive elements. Beginning with a single seed sequence fragment, TIGR
Assembler extends the current contig by attempting to add the best matching
fragment based on oligonucleotide content. The contig and candidate fragment are
aligned using a modified version of the Smith-Waterman algorithm which provides
for optimal gapped alignments (Waterman, M. S., Methods in Enzymology
164:765 (1988)). The contig is extended by the fragment only if strict criteria for
the quality of the match are met. The match criteria include the minimum length of
overlap, the maximum length of an unmatched end, and the minimum percentage
match. These criteria are automatically lowered by the algorithm in regions of
minimal coverage and raised in regions with a possible repetitive element.
Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on
partial mismatches at the ends of alignments and excluded from the current contig.
TIGR Assembler is designed to take advantage of clone size information coupled
with sequencing from both ends of each template. It enforces the constraint that
sequence fragments from two ends of the same template point toward one another
in the contig and are located within a certain range of base pairs (definable for each
clone based on the known clone size range for a given library).
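
The hash table step described above can be illustrated with a small sketch. It indexes every 12 bp subsequence of each fragment, lists fragment pairs that share at least one such subsequence, and counts potential overlaps per fragment. This is not the TIGR Assembler implementation: the function names are hypothetical, reverse-complement matches are ignored for brevity, and no repeat threshold is implied.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length named in the text


def build_kmer_index(fragments: dict, k: int = K) -> dict:
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def candidate_overlaps(fragments: dict, k: int = K) -> dict:
    """Return {(id_a, id_b): number of shared k-mers} for fragment pairs
    sharing at least one k-mer (forward strand only)."""
    shared = defaultdict(int)
    for ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1
    return dict(shared)


def overlap_counts(pairs) -> dict:
    """Count potential overlaps per fragment; unusually high counts may
    flag fragments that fall into repetitive elements."""
    counts = defaultdict(int)
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)
```

Candidate pairs found this way would then be passed to a gapped alignment step before any fragment is added to a contig.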

The process resulted in 391 contigs as represented by SEQ ID NOs: 1-391. 
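
For reference, the gapped alignment step used during assembly can be illustrated with the classic Smith-Waterman local alignment score. The sketch below implements the textbook algorithm with a linear gap penalty, not the modified version used by TIGR Assembler; the scoring values are assumptions chosen for the example.

```python
def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between a and b.

    Classic dynamic programming with a linear gap penalty, keeping only
    the previous row of the score matrix in memory.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * cols
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(0,                # start a new local alignment
                         prev[j - 1] + step,  # match or mismatch
                         prev[j] + gap,       # gap in b
                         cur[j - 1] + gap)    # gap in a
            best = max(best, cur[j])
        prev = cur
    return best
```

A score of this kind could be combined with the match criteria named in the text (minimum overlap length, maximum unmatched end, minimum percentage match) to decide whether a fragment extends a contig.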

3. Identifying Genes

The predicted coding regions of the Streptococcus pneumoniae genome 
were initially defined with the program GeneMark, which finds ORFs using a 
probabilistic classification technique. The predicted coding region .sequences were 
used in searches against a database of all nucleotide sequences from GenBank 

25 (October, 1997), using the BLASTN search method to identify overlaps of 50 or 
more nucleotides with at least a 95% identity. Those ORFs with nucleotide 
sequence matches are shown in Table 1. The ORFs without such matches were 
translated to protein sequences and compared to a non-redundant database of 
known proteins generated by combining the Swiss-prot, PIR and GenPept 

30 databases. ORFs that matched a database protein with BLASTP probability less 
than or equal to 0.01 are shown in Table 2. The table also lists assigned functions 
based on the closest match in the databases. ORFs that did not match protein or 
nucleotide sequences in the databases at these levels are shown in Table 3. 
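
The assignment of each ORF to Table 1, 2 or 3 follows directly from the thresholds stated above and can be sketched as below. The function and parameter names are hypothetical, and the sketch assumes the best BLASTN and BLASTP hits have already been reduced to an overlap length, a percent identity and a probability (with `None` meaning no hit was reported).

```python
from typing import Optional


def assign_table(blastn_overlap: Optional[int],
                 blastn_identity: Optional[float],
                 blastp_probability: Optional[float]) -> int:
    """Return 1, 2 or 3 for the table in which an ORF would be listed."""
    # Table 1: nucleotide match of 50 or more bases at 95% identity or better.
    if (blastn_overlap is not None and blastn_identity is not None
            and blastn_overlap >= 50 and blastn_identity >= 95.0):
        return 1
    # Table 2: no such nucleotide match, but a protein match with
    # BLASTP probability less than or equal to 0.01.
    if blastp_probability is not None and blastp_probability <= 0.01:
        return 2
    # Table 3: no match at either level.
    return 3
```

The nucleotide test is applied first because, as described, only ORFs without a nucleotide match are translated and searched against the protein database.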
ILLUSTRATIVE APPLICATIONS

1. Production of an Antibody to a Streptococcus pneumoniae Protein

Substantially pure protein or polypeptide is isolated from the transfected or
transformed cells using any one of the methods known in the art. The protein can
also be produced in a recombinant prokaryotic expression system, such as E. coli,
or can be chemically synthesized. Concentration of protein in the final preparation
is adjusted, for example, by concentration on an Amicon filter device, to the level
of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can
then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes of any of the peptides identified and
isolated as described can be prepared from murine hybridomas according to the
classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated
with a few micrograms of the selected protein over a period of a few weeks. The
mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma
cells, and the excess unfused cells destroyed by growth of the system on selective
media comprising aminopterin (HAT media). The successfully fused cells are
diluted and aliquots of the dilution placed in wells of a microtiter plate where
growth of the culture is continued. Antibody-producing clones are identified by
detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth.
Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones
can be expanded and their monoclonal antibody product harvested for use. Detailed
procedures for monoclonal antibody production are described in Davis, L. et al.,
Basic Methods in Molecular Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a
single protein can be prepared by immunizing suitable animals with the expressed
protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many
factors related both to the antigen and the host species. For example, small
molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations
and dose, with both inadequate and excessive doses of antigen resulting in low titer
antisera. Small doses (ng level) of antigen administered at multiple intradermal
sites appear to be most reliable. An effective immunization protocol for rabbits
can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991
(1971).

Booster injections can be given at regular intervals, and antiserum harvested
when antibody titer thereof, as determined semi-quantitatively, for example, by
double immunodiffusion in agar against known concentrations of the antigen,
begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of
Experimental Immunology, Wier, D., ed., Blackwell (1973). Plateau concentration
of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM).
Affinity of the antisera for the antigen is determined by preparing competitive
binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of
Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For
Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in
quantitative immunoassays which determine concentrations of antigen-bearing
substances in biological samples; they are also used semi-quantitatively or
qualitatively to identify the presence of antigen in a biological sample. In addition,
antibodies are useful in various animal models of pneumococcal disease as a means
of evaluating the protein used to make the antibody as a potential vaccine target or
as a means of evaluating the antibody as a potential immunotherapeutic or
immunoprophylactic reagent.

4. Preparation of PCR Primers and Amplification of DNA

Various fragments of the Streptococcus pneumoniae genome, such as those
of Tables 1-3 and SEQ ID NOS: 1-391, can be used, in accordance with the present
invention, to prepare PCR primers for a variety of uses. The PCR primers are
preferably at least 15 bases, and more preferably at least 18 bases in length. When
selecting a primer sequence, it is preferred that the primer pairs have approximately
the same G/C ratio, so that melting temperatures are approximately the same. The
PCR primers and amplified DNA of this Example find use in the Examples that
follow.
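
The length and G/C preferences stated above can be checked with a few lines of code. The sketch below is illustrative only: the function names and the 0.10 tolerance on the difference in G/C fraction are assumptions, and the melting temperature estimate uses the common Wallace rule of thumb for short oligonucleotides (2 degrees C per A or T, 4 degrees C per G or C), which is not taken from this specification.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def primer_pair_ok(fwd: str, rev: str,
                   min_len: int = 15,
                   max_gc_diff: float = 0.10) -> bool:
    """True if both primers meet the minimum length and have similar G/C.

    min_len follows the 15-base minimum given in the text; max_gc_diff is
    a hypothetical tolerance chosen for illustration.
    """
    if len(fwd) < min_len or len(rev) < min_len:
        return False
    return abs(gc_fraction(fwd) - gc_fraction(rev)) <= max_gc_diff
```

Passing `min_len=18` applies the more preferred length instead; because the Wallace estimate depends only on base composition, primers of equal length with matching G/C fractions receive the same estimate.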

5. Gene Expression from DNA Sequences Corresponding to ORFs

A fragment of the Streptococcus pneumoniae genome provided in Tables 1-3
is introduced into an expression vector using conventional technology.
Techniques to transfer cloned sequences into expression vectors that direct protein
translation in mammalian, yeast, insect or bacterial expression systems are well
known in the art. Commercially available vectors and expression systems are
available from a variety of suppliers including Stratagene (La Jolla, California),
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If
desired, to enhance expression and facilitate proper protein folding, the codon
context and codon pairing of the sequence may be optimized for the particular
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767,
incorporated herein by this reference.

The following is provided as one exemplary method to generate
polypeptide(s) from cloned ORFs of the Streptococcus pneumoniae genome
fragment. Bacterial ORFs generally lack a poly A addition signal. The addition
signal sequence can be added to the construct by, for example, splicing out the poly
A addition sequence from pSG5 (Stratagene) using BglI and SalI restriction
endonuclease enzymes and incorporating it into the mammalian expression vector
pXT1 (Stratagene) for use in eukaryotic expression systems. pXT1 contains the
LTRs and a portion of the gag gene of Moloney Murine Leukemia Virus. The
positions of the LTRs in the construct allow efficient stable transfection. The
vector includes the Herpes Simplex thymidine kinase promoter and the selectable
neomycin gene. The Streptococcus pneumoniae DNA is obtained by PCR from the
bacterial vector using oligonucleotide primers complementary to the Streptococcus
pneumoniae DNA and containing restriction endonuclease sequences for PstI
incorporated into the 5' primer and BglII at the 5' end of the corresponding
Streptococcus pneumoniae DNA 3' primer, taking care to ensure that the
Streptococcus pneumoniae DNA is positioned such that it is followed by the poly
A addition sequence. The purified fragment obtained from the resulting PCR
reaction is digested with PstI, blunt ended with an exonuclease, digested with
BglII, purified and ligated to pXT1, now containing a poly A addition sequence
and digested with BglII.
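
The primer design step in the preceding paragraph, a PstI site on the 5' primer and a BglII site at the 5' end of the 3' primer, can be sketched as follows. The function name and the 18-base annealing length are assumptions made for illustration; the recognition sequences used are the standard ones for PstI (CTGCAG) and BglII (AGATCT). Extra flanking bases, which are often added outside a restriction site, are not modeled.

```python
PSTI_SITE = "CTGCAG"    # PstI recognition sequence
BGLII_SITE = "AGATCT"   # BglII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_tailed_primers(orf: str, anneal_len: int = 18):
    """Build a 5' primer carrying a PstI site and a 3' primer carrying a
    BglII site at its 5' end, for amplifying the given ORF sequence.

    anneal_len is a hypothetical choice of how many ORF bases each primer
    anneals to; it is not specified in the text.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    five_prime = PSTI_SITE + orf[:anneal_len]
    three_prime = BGLII_SITE + reverse_complement(orf[-anneal_len:])
    return five_prime, three_prime
```

Because the BglII site sits at the 5' end of the 3' primer, it ends up downstream of the amplified ORF, which is what allows the insert to be followed by the poly A addition sequence after ligation.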

The ligated product is transfected into mouse NIH 3T3 cells using
Lipofectin (Life Technologies, Inc., Grand Island, New York) under conditions
outlined in the product specification. Positive transfectants are selected after
growing the transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri).
The protein is preferably released into the supernatant. However, if the protein has
membrane binding domains, the protein may additionally be retained within the cell
or expression may be restricted to the cell surface. Since it may be necessary to
purify and locate the transfected product, synthetic 15-mer peptides synthesized
from the predicted Streptococcus pneumoniae DNA sequence are injected into mice
to generate antibody to the polypeptide encoded by the Streptococcus pneumoniae
DNA.

Alternatively and if antibody production is not possible, the Streptococcus
pneumoniae DNA sequence is additionally incorporated into eukaryotic expression
vectors and expressed as, for example, a globin fusion. Antibody to the globin
moiety then is used to purify the chimeric protein. Corresponding protease
cleavage sites are engineered between the globin moiety and the polypeptide
encoded by the Streptococcus pneumoniae DNA so that the latter may be freed
from the former by simple protease digestion. One useful expression vector for
generating globin chimerics is pSG5 (Stratagene). This vector encodes a rabbit
globin. Intron II of the rabbit globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases
the level of expression. These techniques are well known to those skilled in the art
of molecular biology. Standard methods are published in methods texts such as
Davis et al., cited elsewhere herein, and many of the methods are available from the
technical assistance representatives from Stratagene, Life Technologies, Inc., or
Promega. Polypeptides of the invention also may be produced using in vitro
translation systems such as in vitro Express(TM) Translation Kit (Stratagene).

While the present invention has been described in some detail for purposes
of clarity and understanding, one skilled in the art will appreciate that various
changes in form and detail can be made without departing from the true scope of
the invention.

All patents, patent applications and publications referred to above are
hereby incorporated by reference.

(1) GENERAL INFORMATION:

(i) APPLICANT: Charles Kunsch
Gil H. Choi
Patrick S. Dillon
Craig A. Rosen
Steven C. Barash
Michael R. Fannon
Brian A. Dougherty

(ii) TITLE OF INVENTION: Streptococcus pneumoniae Polynucleotides and Sequences

(iii) NUMBER OF SEQUENCES: 391

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Human Genome Sciences, Inc.

(B) STREET: 9410 Key West Avenue

(C) CITY: Rockville

(D) STATE: Maryland

(E) COUNTRY: USA

(F) ZIP: 20850

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage

(B) COMPUTER: HP Vectra 486/33

(C) OPERATING SYSTEM: MSDOS version 6.2

(D) SOFTWARE: ASCII Text

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Brookes, A. Anders

(B) REGISTRATION NUMBER: 36,373

(C) REFERENCE/DOCKET NUMBER: PB340P1

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (301) 309-8504

(B) TELEFAX: (301) 309-8512

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5625 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCAAGCAAAA CCAGCTACAG CTAAAGGAAC TTACGTAACA AACTTGACTA TCACAACTAC 60 

TCAAGGTGTT GGTATCAAAG TTGACGTAAA CTCACTTTAA TCAGTAGTTA AAGTAATGTA 120

AAAAAGTTGA AGACGCTATG TCTCAACTTT TTTTGATGTA CGACGGGCAT GTTGTATAGT 180 

AGATGTGTAC TATTCTAGTT TCAATCTACT ATAGTAGCTC AGAAGTCGGT ACTTAAACGT 240 

GCTATATCAA AACCAGTCCT TGAAAAACGT GGACTGGTTT CGTGTTTGGA TTATTACCTT 300 

GAACGACATG CGTTAAAAGT TAGTTGAACC GCCGTATGCC GAACGGACGT ACGGTGGTGT 360 

GAGAGGGGCT AGAGATTATC CCCTACTCGA TTTCGAAATC TAGTGGAATG AATCTGGAAT 420 

AGTCCATCGA GCTTTCTAAT ACTCTTCGAA AATCTCTTCA AACCACGTCA ACGTCGCCTT 480 

GCCGTGCGTA TGGTTACTGA CTTCGTCAGT TCTATCCACA ACCTCAAAAC AGTGTTTTGA 540 

GCTGACTACG TCAGTTCCAT CTACAACCTC AAAACAGTGT TTTGAGCAAC CTGCGGCTAG 600 

TTTCCTAGTT TGCTCTTTGG TTTTCATTGA GTATAACACA TTGTTAGAAG TTGGTTTAAA 660 

TTTCCTAATC AGTTTGTTCA CATTTACCTT CGATATATTA TATCCCATAG TTAAGGTTGG 720 

TCATACAGAT GATTATAGTC ATGGAGCCGT AAAACTTAGT GTTTCTTTAG TTGACAAAGA 780 

TGCCATGAAA AAAATATTTG TAACTGTAAT AGGATATTTT GAAATAAATA TAGATGAAAA 840

TATCACCGAT ATTCTATACG TAAATGGTAC TGCTATTCTT TATCTTTATT TACGTTCAAT 900 

TGTTTCAATA GTTTCGGCAA TTGATAGCAG TGAAGCAATG TTGCTACCTA TCATTAATGT 960 

TTTAGAGTTA CTAGATAAAT CTCAACCTTT TGAAGAAGAA TAATTTATTA GCTCACTA/iA 1020 

TTGAGGGTAA GGAAAAGTAA AAGCAGTAAG AAAAATGTCT TGCATTATAC AGCAACCTTT 1080

TGGGAATGAG TGGATGGATT GAATAAAATT TGATTAAGAG TGGATGATTT ATCTGTAGAT 1140

TATTATTGGA CAGTTAGTCT TGAAGTAGTC TAAGAATTAG GTTATAATCA GTAGAAGCCT 1200

TGCTAATAAT GAGGAGGTTA GTTTATGTAT AGTAGACTGA ATCTAAAATA GTACGAAACA 1260

ATTGCTAAAA CATTTATAGA AATTAATTTT ACTTTCCCAA TCGATTTGTT CTCATCTTAT 1320 

TTCAATCCGC TATATATTAT GGTATCGAAT CTTCATCAGA ATGATAAAAT TAATCAATTG 1380 

ATATCTOATT ACAAACAGAA TATGAAAGCT TTTTATATCA CTATTGAAAA ATTTATACGA 1440

GATGATGAAA GCCTTAAGTG TTATTTTATA AAGGTTATTT CAAGTCGTTC CAAGGTAACA 1500

AGTCTAGATC AOATTGAAGC TGATAAAACG ATACAAAGAA AATATTCAAG TGAGCTAAAA 1560 

AAATTTATTG GATTTTATAA TGAGATTATT TGTGAGGAAA ATAGTTTCCT ACATGTACGA 1620 

AAGAGGTGGT CGAGTTGGTT TAGGTAGTCG ATGCGTGAGT TGATAATTCT CAGGGTATGG 1680 

ACTTCTTTTT CATGAATGAG GTAAAAGAGC AGGTATTGTT TAGAGACAAT CATTCTGAGC 1740 

ATATTTTCTG GATAGAGGGA GTATCCGATT TTATGATCAA AGTTAATACC GCCCTCTGGT 1800 

GAGAAGATGA GTAGGTTGGT AATTTAAACT ATTAAACAGA ATTTTTGATT AAAAGTATTA 1860 

TTTCATGAGA GAAATCCTAA TTTCACAATC CATAGGCAAA CGCTTGCATT TCGTTTTTTA 1920 

TTGGACTATA ATAGGTTGGT ATAAAGCCTT CTGTAGTAAT AAAATGTAGA AGGTGTAGAA 1980 

AGTAAGGATT TAGAATATTT GTAGTTAAAA ACACAATGTT GCTATTCCTT ACGATAGGGA 2040 

GATAGATATG GCAATGATAG AAGTGGAACA TCTTCAGAAA AATTTTGTGA AGACTGTTAA 2100 

GGAACCGGGC TTGAAGGGGG CTTTGCGCTC CTTTATTCAT CCTGAAAAGC AGACCTTTGA 2160 

AGCGGTCAAG GATTTGACCT TTGAGGTTCC AAAAGGGCAG ATTTTAGGAT TTATCGGGGC 2220

AAATGGTGCT GGGAAGTCGA CAACCATTAA AATGCTGACA GGAATTTTGA AACCAACATC 2280 

TGGTTTTTGT CGGATTAACG GCAAGATTCC CCAGGACAAT CGGCAAGATT ATGTCAAAGA 2340 

TATTGGCGTA GTCTTTGGAC AACGCACCCA GCTATGGTGG GATTTGGCTC TGCAAGAGAC 2400 

CTACACTGTC TTAAAAGAGA TTTATGATGT GCCAGACTCG CTCTTTCATA AGCGTATGGA 2460 

CTTTTTGAAT GAAGTCTTGG ATTTGAAGGA CTTTATCAAG GATCCCGTGC GGACTCTTTC 2520 

ACTGGGACAA CGGATGCGGG CGGATATTGC GGCCTCCTTG CTCCACAATC CCAAGGTTCT 2580 

TTTTTTAGAT GAGCCGACCA TTGGTTTGGA CGTTTCGGTT AAGGATAATA TTCGTCGGGC 2640 

AATTACTCAG ATCAATCAAG AGGAAGAAAC TACCATTCTT TTGACCACTC ACGATTTGAG 2700 

TGATATTGAG CAACTTTGTG ATCGGATTTT CATGATTGAC AAGGGGCAAG AGATTTTTGA 2760 

TGGAACGGTG AGCCAACTCA AGGAGACCTT TGGTAAGATG AAGACTCTCT CTTTTGAACT 2820 

GCTACCAGGT CAAAGTCATC TCGTCTCTCA CTATGACGGT CTGTCTGATA TGACCATTGA 2880 

TAGACAAGGA AACAGCCTCA ACATTGAATT TGATAGTTCT CGCTACCAGT CAGCTGACAT 2940 

TATCAAGCAA ACCCTGTCTG ATTTTGAAAT CCGCGATTTG AAGATGGTGG ATACGGATAT 3000 

TGAGGATATT ATCCGTCGCT TCTACCGAAA GGAGCTCTAG GATGATGAAA TTGTGGAGAC 3060 

GTTATAAACC CTTTATCAAT GCAGGGGTTC AGGAGTTGAT TACTTACCGA GTCAACTTTA 3120 

TTCTCTATCG GATTGGCGAT GTCATGGGGG CTTTTGTGGC CTTTTATCTC TGGAAGGCTG 3180 
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TCTTTGATTC TTCGCAAGAG 


TCTTTGATTC 


AGGGCTTCAG 


TATGGCGGAT 


ATCACCCTCT 


3240 


ACATCATCAT GAGTTTTGTG ACCAATCTTC TGACTAGATC CGATTCGTCC TTTATGATTC 


33O0 


GGGAGGAGGT CAAGGATGGC 


TCCATTATCA 


TGCGTTTGTT 


GCGACCAGTG 


CATTTTGCGG 


3360 


CCTCCTATCT TTTCACCGAG 


CTTGGTTCCA 


AGTGGTTGAT 


TTTTATCAGC 


GTTGGCCTTC 


3420 


CATTTTTAAG TGTCATTGTC 


TTGATGAAAA 


TCATATCGGG 


TCAAGGTATT 


GTAGAGGTGC 


3480 


TAGGATTAAC TGTCATTTAT 


CTTTTTAGCT 


TAACGCTCGC CTATCTGATT AACTTTTTCT 


3540 


TTAATATTTG CTTTGGATTT TCAGCCTTTG TGTTTAAAAA TCTTTGGGGT 


TCCAACCTAC 


3600 


TTAAGACTTC CATAGTGGCT 


TTTATGTCGG 


OGAGTTTGAT 


TCCCTTGGCA 


TTTTTTCCAA 


3660 


AGGTTGTTTC AGATATTCTC 


TCCTTTTTGC 


CTTTTTCATC 


CTTGATTTAT 


ACTCCAGTTA 


3720 


TGATCATTGT TGGAAAATAC 


GATGCCAGTC 


AGATTCTTCA 


GGCACTCCTT 


TTGCAGTTCT 


3790 


TCTGGCTCTT AGTGATGGTG 


GGATTGTCTC 


AGTTAATTTG 


GAAACGGGTC 


CAGTCCTTTA 


3840 


TCACCATTCA AGGAGGTTAG 


TATGAAAAAA 


TATCAACGAA 


TGCATCTGAT 


TTTTATCAGA 


3900 


CAATACATCA AACAAATCAT 


GGAATATAAG 


GTAGATTTTG 


TGGTTGGTGT 


CTTGGGAGTC 


3960 


TTTCTGACTC AAGGCTTGAA 


TCTCTTGTTT 


CTCAATGTCA TCTTTCAACA 


TATTCCATTC 


4020 


CTAGAAGGCT GGACCTTTCA AGAGATAGCT 


TTCATTTATG 


GATTTTCCTT 


GATTCCCAAG 


4080 


GGAATGGACC ATCTCTTTTT 


TGACAATCTC 


TGGGCACTAG 


GGCAACGCCT 


AGTCCGAAAA 


4140 


GGGGAGTTTG ACAAGTATCT 


GACTCGTCCC 


ATCAATCCTC 


TCTTTCACAT 


CCTAGTTGAA 


4200 


ACCTTTCAGA TTGATGCCTT 


GGGTGAACTC 


TTAGTCGGTG 


GTATTTTATT 


GGGAACAACA 


4260 


GTGACCAGCA TTGTTTGGAC 


TCTTCCAAAA 


TTCCTGCTTT 


TCCTAGTTTG 


TATTCCTTTT 


4320 


GCGACCTTGA TTTATACTTC 


TCTTAAAATC 


GCAACAGCCA 


GTATCGCCTT 


TTGGACTAAG 


4380 


CAGTCAGGCG CCATGATTTA 


CATCTTCTAT 


ATGTTCAATG 


ACTTTGCTAA 


GTATCCGATT 


4440 


TCTATTTACA ATTCTCTTCT 


TCGTTGGTTG 


ATTAGCTTTA 


TCGTGCCTTT 


CGCCTTTACA 


4500 


GCCTACTATC CAGCTAGCTA 


TTTCTTACAG 


GAAAAGGATG 


TGTTCTTTAA 


CGTAGGAGGT 


4560 


TTGAT6TTGA TTTCTCTGGT 




ATTTCCCTTA 


AACTTTGGGA 


TAAGGGCTTA 


4620 


GATTCCTACG AAAGTGCGGG 


TTCGTAAAAG 


CTAAAGTAAG 


ACTAAAATCA 


AGAAAGAAAC 


4680 


TTATGATGTT TGTAATTGAA 


GAAGTCAAGG 


ATGAAAATCA AAAAAAGGCA 


GTTGTCGCTG 


4740 


AGGTTTTGAA GGATTTGCCA 


GAATGGTTTG 


GAATCCCAGA AAGCACACAA 


GCCTATATAG 


4800 


AAGGAACCAC GACACTGCAA GTTTGGACCG CCTATCAGGA GAGTGATTTG ACTAGATTTG 


4860 


TAAGCTTATC CTATTCGAGT GAAGATTGTG CAGAGATTGA TTGTCTCGGC GTAAAAAAGC 


4920 


TTATCAAG6T AGAAAAATT6 GGAGCCAATT GCTTGCTACT TTAGAGAGTG AAGCTCGTAA 


4980 
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AAAAGTTGGT TATCTGCAGG TCAAAACAGT GGCAGAAGGT TCTAATAAAG ATTATGATCG 5040 

AACAAATGAC TTTTATCGAG GTCTTGGCTT TAAAAAGTTA GAGATTTTTC CTCAACTATG 5100 

GAATCCGCAA AATCCTTGTC AGATTTTGAT TAAAAAGCTT GAATAATATT ACTTGACATC 5160

TATTCTCAGA GTGCTATACT GTAAGTGTAA TCGCCGATTT AGCTTAGTTG GTAGAGCAAG 5220 

GCACTCGTAA AGCCTAGGTT ATAGGTAGAT AAACGACTGA GGATTTGAAA AAATAGATAG 5280 

GTAGAAGATA ACCGTTAAGC CTTACTCTTA GCGGTTATTT ATATTGTTTA ATAGCGCTAA 5340 

TATTTTATCA ATTATGCCTG TTTTCGTGTT TCTGGTAGTT GTTCAAGTTT ATTGCTACTA 5400 

TTTTTGATGG TATGAATGTG CTTATAATGT ATCCCGGTTA ACGAAAGTTT TGGACTTATA 5460

CTCTTCGAAA ATCTCTTCAA ACCACGTCAA CGTCGCCTTG CCGTGCGTAT GGTTATGACT 5520 

TCGTCAGTTC TATCCACAAC CTCAAAACAG TGTTTTGAGT GACTACGTCA GTTCCATCTA 5580 

CAACCTCAAA ACACTGTTTT GCCCAATCTG CGGCTAGTTT CCTAG 5625 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7571 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

CTCTCCAGCT TTCCTTGCGA GTTGGCCATG TTGTGTCTTT AAGAAGTCTA AAAATATCTC 60

CAATAAAACG CATCGCTCTC TCCTATCTCG TTTCTCTGTG TGTAGTGTAC TTGCCACAAT 120

GCTTACAAAA TTTATTTACT TCTAGTCGTG TAGGCTTGAG GTTTCCGCTG ATCTTGATTG 180

AATAGTTTCT CGAACCACAA ACCGCACAAG CTAGGCTTGC TTTTTTTAGT GCCATAACGC 240

CTCCATCTTA TCCATTATAA CAAGAAAGCT AGGCTTTGAC AAGCATCTTA GCGAAATAGA 300 

TTGACTATCG AATCCCATAT TGTTTGAGCC TTTTCCTT7UV TCTTCGCATC TGAGATAGCC 360 

CGGCTAGCCT CATCTACTAG ACTTTGCGCA CGCCCTCGAA TATCAGACAA ATTATCATCT 420 

GTCTGGCTAT TATCATTGGT TTGTACTTGT CTTTTTGTAT TGGCTGGTGC AATTCCATTT 480 

TGCTTATAAG CATTTTCAAC CGTAAAGGTA CTTCCTGGCG TATAAGGTAA AATGGTATTG 540 

GCAATGTTTC TAAAGACATG AGCTGCACCG TTTGAAGTAG AGCCAGCTAG ATAGTGGTTT 600 

TCATCAGTGG TCGGAAAGCC AAGCCAGTGG CTAATCACTA CATCCGGAGT ATAACCAATT 660 

ACCCACTGGT CACTTGTCTA CTCCGGATTG AAAACTGCTT CAGTTGTTCC AGTTTTCCCT 720

GCCATGACAT AGTCTGCAGG CGATGAACTA ATACCGGTAC CGTTGGTGAA AGTCCCCAAC 780

ATCATACTGG TCATCTTGTC AGCTACAGAC TTATCAATCA CCCGTTTTTG TGAATTTTTA 840 

TGACTCGCAA TAACTTGTCC ACTAGCATTT TCAATTCTAC TAATAAAATG AGCTTCAGGC 900 

ATTAAACCTT CATTTGCAAA GGCGGCGTAT GCTTGAGCCA TTTGAAGAGG GTTGGTTTCA 960 

ACACCGCTTC CCAAGGCGAC ACCAAGAACA CGGTCGACCT TTTCCATGTT GAGTCCGAAT 1020 

TTTTCGCCTG CCTCAAAAGC CTTGTCGACA CCCAAATCAT TAACAGTGGC AACAGCAGGT 1080 

AGATTAAGCG ATTCTGCCAA GGCTTGATAC ATAGGAACTT CTCGACTCGT TTTGATCCCT 1140 

GCATAGTTAT CAACCTTATA GCTGTCATAC TGCATGGTAT GGTTATCCAA CTGCTTATTC 1200 

AAAGCCCAGC TTGCTTCAAC TGCTGGCGTA TAAACAACTA AAGGCTTAAT TGTAGAACCA 1260 

GGACTACGCT TTGATTGGGT TGCATAGTTG AAATTCCGGA ATCCAGTTTT ATCATTGTCA 1320 

GCAACTTGAC CGACAACTCC ACGAACTCCC CCTGTTTTCG GTTCGAGGGC TACACTTCCT 1380 

GATTGAGCAA ACGTTCCATC CTCTGCCCTC GGAAATAGCG ATGTGTTTTC ATAAACAATC 1440 

TGCATATTTG CTTGGTAGTT TTGGTCCAGC TCTGTGTAAA TGCGGTAGCC ATTATTGACA 1500 

ATCTCTTCCT CTGTTAGATT ATACTTGGAA ACAGCTTCAT TAACCACCGC ATCAAAATAA 1560 

GAGGGGTAAC GGTAATCTGA GATTTTTCCT TCATACTTAT CGTGCAATTG CGAAGTCATA 1620

TCAACTTCAG CAGCTTTGGT TTCTTGGTTT TTATCAATAT ATCCTGCTGC AACCATATTC 1680

TGCAAGACAG TATCGCGCCG ATTAGTAGAA TCTTCTACGG AATTCAAGGG ATTATACAGT 1740

TCCGGCCCCT TGAGCATCCC TGCCAGAGTC GCAGCTTGAT CCAGACTCAC TTCTGATGCA 1800 

GAAACTCCAA AGTATTTCTT ACTCGCATCT TCTACACCCC ACACACCATT TCCAAAATAA 1860 

GCGTTGTTAA GGTACATGGT TAGAATTTGC TCCTTACTAT ATTTTTTGCT TAATTCTAAG 1920 

GCAAGGAAAA ATTCTTTCGC TTTTCTCTCA ACAGTTTGAT CCTGCGATAA ATAGGCGTTT 1980 

TTAGCCAGCT GTTGGGTAAT GGTAGAGCCA CCACCTGAAC GTCCAGCAGT GACAATAGCC 2040 

AAGAAAAAAC GGCCATAGTT AATCCCGTCA TTTTTATAGA AAGAACGGTC TTCTGTCGCA 2100 

ATAACAGCAT TCTGCAAGTT TTTACTGATG TCAGTCAGCT CAACATAGGT TCCCTTTTGA 2160 

CCAGACAAGG CACCAGCCTC TTTTTCTTCA CGGTCAAAAA TAAGAGTCCG AGTTTTCAAG 2220 

GCATTTTGCA AATCATTGAC ATTGGTCGAC TTGGCTACAG CAAACAAATA GATTCCAACT 2280 

AGCAAGCCTG CACTCAAACC TAGTATAAGG ATAATCTTTG TTAGATGATA ACGACGCCAG 2340 

AATTTTCGAA TCGGACCTAC TTGGGCTAAT TTTTTTCGAT CACTACGAGA GCGACGTAAG 2400 

ATAGTAGAAT CAGAGTCCTC TAGTTCACTT GTTTCTTTTT TAAAAAGAGA AAGAAATTTC 2460 

TCAAATAATT TATCTAATTT CATGCGTTTA TTTTATCATC TTCATCATAG GAAGACAAGA 2520

ATTTAGCTAT TTCXTTATCCA AATAGGGCTT TTTTTGTTAC AATATCTGTA TGCAATTCAC 2580

ATTTACATTA CCCGCCTCTC TACCTCAAAT GACAGTAAAG CAATTACTTG AGGAACAACT 2640

CCTCATCCCT AGAAAAATCC GTCATTTTTT GAGAATCAAG AAACATATTT TGATAAATCA 2700

AGAAGAAGTC CACTGGAAGG AAATCGTAAA TCCTGGAGAT GTTTGCCAGT TGACTTTTGA 2760

CGAGGAAGAT TATTCCCAAA AGACGATCCC TTGGGGCAAC CCAGACTTAG TGCAGGAAGT 2820

TTATCAAGAT CAACACTTGA TTATTGTAAA CAAACCAGAG GGGATGAAAA CGCATGGTAA 2880

TCAACCAAAC GAAATTGCCC TTCTTAACCA TGTCAGTACC TATGTTGGCC AAACCTGCTA 2940

TGTCGTTCAT CGTCTGGACA TGGAAACCAG TGGCTTAGTT CTCTTTGCCA AAAATCCTTT 3000

TATCCTGCCC ATTCTCAATC GCTTATTGGA GAAAAAAGAG ATTTCTAGAG AATATTGGGC 3060

TCTAGTTGAT GGAAATATCA ACAGAAAAGA ACTTGTTTTC AGAGACAAAA TTGGACGTGA 3120

TCGCCATGAT CGTAGAAAAA GAATAGTTGA TGCAAAAAAT GGGCAATATG CTGAAACGCA 3180

TGTAAGCAGA TTAAAGCAAT TCTCAAACAA GACTTCCTTG GCTCATTGCA AGCTAAAGAC 3240

AGGGCGAACC CATCAGATTC GTGTGCACCT TTCGCATCAT AATCTTCCTA TCCTGGGAGA 3300

CCCTCTCTAT AATAGTAAAT CAAAGACAAG CCGGCTTATG CTTCATGCCT TCCGACTTTC 3360

CTTTACCCAC CCACTTACTT TAGAGAAGCT AACTTTCACT ACCCTTTCAA ATACATTTGA 3420

AAAAGAATTA AAAAAGAATG GATGATCGTG TCATCCATTT TTCCATATAA AAAAGCAAGA 3480

CCACAAAGCC TTGCTTTCTA TCAACTCAAG AATTATTTAG CAATTTTTGC GAAGTATTCA 3540

AGAGTACGAA CAAGTTGTGC AGTGTATGAC ATTTCGTTGT CGTACCATGA TACAACTTTA 3600

ACCAATTGTT TACCGTCAAC GTCAAGAACT TTAGTTTGAG TTCCGTCAAA CAATGAACCG 3660

TAAGACATAC CTACGATATC TGAAGATACG ATTGGATCTT CTGTGTAACC GTATGATTCG 3720

TTTGAAGCTG CTTTCATAGC TGCGTTCACT TCATCAACAG TAACGTTCTT TTCAAGAACT 3780

GCTACCAATT CAGTAACTGA TCCAGTTGGA GTTGGAACGC GTTGTGCAGA TCCGTCAAGT 3840

TTACCATTCA ATTCTGGGAT TACAAGACCG ATAGCTTTTG CAGCACCAGT TGAGTTAGGA 3900

ACGATGTTTG CAGCACCAGC GCGAGCACGG CGAAGGTCAC CACCACGGTG TGGTCCGTCA 3960

AGGATCATTT GGTCACCAGT GTAAGCGTGG ATAGTAG^'CA TCAATCCTTC AACAACACCA 4020

AAGTTGTCTT GAAGAGCTTT AGCCATTGGA GCCAAGCAGT TTGTAGTACA TGAAGCACCT 4080

GAGATAACTG TTTCAGTACC GTCAAGAACG TCX3TGGTTAG TGTTGAATAC AACTGTTTTA 4140

ACGTCGTTTC CACCAGGAGC AGTGATAACA ACTTTTTTAG CTCCACCTTT AAGGTGTTTT 4200

TCAGCTGCTT CTTTCTTAGC AAAGAAACCA GTAGCTTCAA GAACGATTTC TACACXX5TCA 4260

GTAGCCCAGT CGATTTGTTC TGGATCACGT TCAGCAGAAA CTTTGATGAA TTTACCGTTA 4320

ACTTCAAATC CACCTTCTTT AACTTCAACA GTACCGTCGA AACGACCTTG AGTTGTGTCG 4380 

TATTTCAACA AGTGTGCAAG CATAACTGGA TCTGTAAGGT CGTTGATGCG TGTAACTTCA 4440 

ACACCTTCTA CGTTTTGGAT ACGACGGAAA GCAAGACGAC CGATACGTCC GAAACCGTTA 4500 

ATACCAACTT TAACTACCAT TAGTGATTTC CTCCTTATGA AAATCATGAA ATTTTTATTG 4560

TGAAAAGAGT AACTTGAATC ACTACAAATC ACCTTTCAAC AAACCTATTA TACAACTATT 4620

TGAGTTGAAT TGCAAGTATG GCCATTGTTT TTCTATGTTA GTTTCTTTTT AAGACTGTAA 4680

ACCAAGGAAT CCCTTACTAT TCATAGCATA ACGATTCTAT AGGATCCATT TTACTAATCT 4740 

TACGCGCCGG GAAGTAGGCT GAGACATAAC CAAGTAATAG AGCGAAAACT AGAGTTCCTA 4800 

AAACAGATAA AAGATTTAAT TTAAAAACCT TAGTGATGGA TGGGTAAAAG TGACTTACAA 4860 

TCGCATTCGC CAAACTTCCC ACCCCTTGTG CAACCAAAAA TGCCAGCAGC AAGGCGATGC 4920 

CTACAATCCA GATAGCCTCG TAAATAAAAA TTCCTTTGAC ATCACGATTC TGATAACCAA 4980 

CTGCTTTCAT GACACCTATT TCCTTGGAAC GTTGCATGAT ATTGATGTAA ATAATGATAC 5040 

CAATCATAAC CGCTGCTACC ACAATAGCTT GTGATGAAAG CACAATCAAT AATCCCTGAA 5100 

TAACACGAAT AAAGGTAATC ACAATATCAA GAACTCTCTG TTGAGAAAGC ACAGTATACT 5160 

TCTTATTTTT CTGTAATTCT TCTGTTACTA CTTTTGTCTG TGATGGATCT TTGAGTTCCA 5220 

AGATAAAATA AGATACAGCT TTCOTAAATC CAGCCTCTTT CAAAATCGTT TCCATTTGAT 5280 

GAGACAGCAT GAAACTGTTG CTGTCCTCCA TGTCATCTTC ATCATTGATT ACACGTACAA 5340

TCTTCGTTTG AAATTGAGCA ATCTTACTAG TTTCGGCAGC ACTTTCTACA ATGCTGGCTG 5400 

AGACTGATTT GCCAATAAGA TCATTAGCTG TCAAATTTTT TCCTGTCTGT TCATTCCAAT 5460 

TTTTTAGTAA ACTGCTTGGA ATCGTTAATC CCTGTTCATT TGTATCAGTA TAGAGGGATC 5520 

CAGCCAACAC TTTGTCCGTC TCATTATTAC TAACAGAGAT ACTTGTATCA TCATAAAGAC 5580 

TCACTACTTG AGCATAAGAA GGCATCGTTT GACTCAGATC CATTTCTTGC CCATCTATAG 5640 

TAATATTTGA CATGTTCATC CCAAAAGGAC TCTCCAAATA TTTAATAGCT TCTTTCCCAA 5700 

CTGTATCCGT GATATATAGT CAATTGAAAC AAGAGCAGGA TAAAAAAGCC TCGTAAAAGG 5760 

TATTGCAACT TGGTAATACC TTTTTGAGGT GCTTTTTGAT ATGAGCCCAT GTTTTCTCAA 5820

TAGGATTGTA CTCAGGCGAG TAGGGAGGAA GAGGTAAAAG TTTATGCCCA AACTCTTCGC 5880 

ATAAAAGTTC TAGCTTCCCC ATTCTATGGA ATCTTACATT ATCCATAATA ATAACCGATG 5940 

GTGTGTTTAA TGTTGGTAAG AGAAAATXCT GAAACCAAGC TTCAAAAAAG TCGCTCGTCA 6000 

TCGTCTCTTC GTAAGTCATT GGAGCGATTA ATTCACCATT TGTTAGACCT GCAACCAAAG 6060

AAATCCTCTG ATATCTTCTT CCAGATACTT TGCCTCTTAT TAATTGACCT TTTAATGAGC 6120

GACCATATTC TCGATAAAAA TAAGTATCGA ATCCTGTTTC GTCAATCTAA ACAGGTGCTA 6180

GGTGCTTTAA ACTATTAAAA TTCTTAAGAA ATAAGGCTAC TTTTTCTGGG TCTTGTTCAT 6240

AGTAGGTGTG GTTCTTTTTT CGAGTGTAGC CCATAGCTTT GAGCGTATAG TGGATGGTAG 6300

TTGGATGACA GCCAAATTCA GAAGCTATTT CAGTCAAATA AGCGTCTGGA TTGTCAGTAA 6360

GATAGTTTTT AAGTCTATCT CTATCAACCT TTCTTGGTTT TATTCCTTTT ACTTGGTGGT 6420

TTAGCTCTCC TGTTTTCTCT TTTAGCTTTA ACCAGCCATA AATGGTATTA CGTGAGATTT 6480

GGAAAACGTG TGATGCTTCT GTTATACTAC CTGTTCGCTC ACAATAAGAG AGAACTTTTT 6540

TACGAAAATC TATTGAATAT GCCATAAAAA GATTATACCA CATTGTGTAC TATTTTTGGT 6600

TCATTTTACT ATATTTGAAG AGGCGTTTAA ACTATCTGAC ATAAAACTCG TTCTAGAGGA 6660

AAGACATCCT TTAAAAAGTT AGTTTATTTT ACAACTTAGA CATCAAGGTA GGTTAACCCC 6720

TTCATGGAAA AATCAAGACT CTTAGCACTA TGGGTTAAAC TACCACTGGA GACGTAATCA 6780

ATCOCTAAAC CACGAAAACG GCTAATAGTG GTCATATCAA TATTTCCAGA ACATTCAATC 6840

CGAGAACGTC CTGCAATTAG GGTAATGGCC TGTTCAATCT GTTCCAATGA CATATTATCC 6900

AACATGATAA TATCAGCACC CGCCGCCGCA GCTTCTTCGG CAGCAGCAAG GCTTTCCACT 6960

TCCACCTCGA CCATTTTCAC AAAAGGGGCA TAGGCACGCG CTTGAGCAAT TGCCTTTTGA 7020

ACACTACCTA CTGCCGCAAT GTGATTGTCT TTTAGCAGGA TAGCATCTGA TAAATTAAAG 7080

CGATGATTAT AGCCACCGCC AACTCTCACG GCATATTTCT CAAAAAGACG TAAATTAGGA 7140

GTAGTTTTTC GAGTATCAAA TACCTTAATG CAATCATCGC CTAAGGCTTC TACATAAGCA 7200

GCTGTCATCG AAGCAATCCC TGATAAATGT TGTAAAAAAT TCAAGGCAAC GCGTTCACAT 7260

GTTAAGAGAC TTCTCACCGA GCCTATGATT TCTAAAACCA AATCGCCACT AGTCAAACGA 7320

TCCCCATCCT TAAATTGATG AGGATTCTGG AAGGTCACCT CX3GCATCAAA TAGGGTAAAA 7380

ACCCTTTGAA AAACGGTTAG CCCCGCTAAA ACACCAGCTT CCTTGGCAAA AAGCGACACC 7440

TTGGCTTGGC CATGATGATC AAAAATGGCA TTGGTACTGT AATCTTCGGA ATGAACATCT 7500

TCTCGCAAGG CTGCTTTCAA TGTATCATCT ATTTGAAAAG GGGTTAAATC AGTTGAAATG 7560

ATTGACATCA C 7571

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26385 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TTTGCTAGTG GCTTAAATTC TTCAGGAAAA TCAGGCGTAT CTAAAAGTCG TGTCGTTTTT 60 

GTTTCATCTA TATAAAGACT TCCTGCTCCC CCTACAACTA GAAAACGTGT CTGTGTTCCA 120

GCAAGAAGCT GATTAAATAG TTCGATTGAT TTGCTGTGGA GCGGTAGCGT ATCTGGTGTA 180 

TAAGCACCAA ACGCTGAAAT AACAGCATCA AATCCAGTAA GATCATCTTT TGTCAACTCA 240 

AATAAATCTT TTTTAATAAT AGACTCAGCT TGACTTTTGT TTTCAGAACG AACAATAGCC 300 

GTTACTTCAT GTCCTCGTTT GACTGCTTCT TCAACAATTG CTTTCCCCGC TTGTCCATTT 360 

GCTGCAATAA CTGCTAGTTT CATTTTTTAT ACCTCTCTTG TTGTAATTAT TTTAGTTACA 420 

GAAATTGTGA CACTCTTAAT AATCAATGTC AATAGTCTTG CTTAATTATT ATCAAAATAT 480 

TTCTACCAAG AAAACTAACC ATGATTCTAG TGAAAAAAAA TCTTCTTTGT CAACAAATTT 540 

ACTTTCTTGT TTTAAACATG CTATAATAAT CATAGCAAGA GATCTAAGTT GTCTGTTTTT 600 

TTAAAACGAG GTGATTATCA TGCGTAGATT CTATTCCCAT CTCCCCTACT ATCTGGTCAT 660 

ATTATTCTTT TATTGGCCAC TTTATGAGTT GTTCTTACTA GTTGTTTCTG ACCCCCTTAC 720 

ACTCAAGGGA CTCTATATAA ACAATCTTCT CTTCTTTACA CCTCTGGTAA TCTTGATTGT 780 

ATCGTTACTC TATAGCTACC GTTTCCGTTT CTCACTTTGA TGGTTAGTTG GTAACGGACT 840 

GCTCTTTTAC TTTACTATCA TAACCTTTGG TGAGTTTATA CTAATTTACT TGCTAATCTA 900 

TGAAACAGTT GCTCTGGTCG GCATGGATTC TGGTATTAGC ATCAAGCATA TTCTACAAAA 960 

AATGAAAAAC AAAAAACTTT CACAAAATCC TTGAAAAATC TCACAATCAT GCTATAATAA 1020 

TCCATAGAGA CAAGTCACTT AGTCCCTTTC TACTAGAGAG TGCGTGGTTG CTGGAAACGC 1080 

ATAGGAAGTC TAAACTGATA CTACTCTTGA GTTTTTTATG AAAACATAAA ACGGTGGCCA 1140 

CGTTAGAGCC GATCAGAGGT GTCCCTCTCT TTTGAGGTAC ATAAATGAAG GTGGAACCAC 1200 

GTTGCGACGT CCTTTCGAGG ATGTCGCATT TTTTTATTAG GATACTAATT ATGGAGTTGC 1260 

AAGAATTAGT GGAGCGCAGT TGGGCAATCC GACAAGCTTA TCACGAACTG GAAGTTAAGC 1320

ATCATGATTC CAAGTGGACG GTAGAAGAAG ACCTCTTGGC TTTATCTAAT GATATTGGAA 1380

ATTTCCAACG ACTGGTGATG ACAAAGCAAG GACGCTACTA TGATGAAACA CCCTACACAC 1440

TGGAACAAAA ACTTTCAGAA AATATCTGGT GGCTATTAGA ACTTTCTCAA CGTTTGGATA 1500 

TAGACATTCT GACGGAAATG GAAAACTTCC TCTCTGATAA AGAAAAGCAA TTGAACGTTA 1560 

GGACTTGGAA GTAGTCTGCT GATAAAAAAT CAATGCTTAG AAACTATGAA ATAATAAAAA 1620

AGGAGAACAT CATGATTAAC ATTACTTTCC CAGATGGCGC TGTTCGTGAA TTCGAATCTG 1680

GCGTAACAAC TTTTGAAATT GCCCAATCTA TCAGCAATTC CCTAGCTAAA AAAGCCTTGG 1740 

CTGGTAAATT CAACGGCAAA CTCATCGACA CTACTCGCGC TATCACTGAA GATGGAAGCA 1800 

TCGAAATTGT GACACCTGAT CACGAAGATG CCCTTCCAAT CTTGCGTCAC TCAGCAGCTC 1860 

ACTTGTTCGC CCAAGCAGCT CGTCGTCTTT TCCX^IGACAT TCACTTGGGA GTTGGTCCAG 1920 

CCATCGAAGA TGGTTTCTAC TACGATACTG ACAACACAGC TGGTCAAATC TCTAACGAAG 1980 

ACCTTCCTCG TATCGAAGAA GAAATGCAAA AAATCGTCAA AGAAAACTTC CCATCTATTC 2040 

GTGAAGAAGT GACTAAAGAC GAGGCACGTG AAATCTTCAA AAATGACCCT TACAAGTTGG 2100 

AATTGATTGA AGAACACTCA GAAGACGAAG GCGGTTTGAC TATCTATCGT CAGGGTGAAT 2160 

ATGTAGACCT CTGCCGTGGA CCTCACGTTC CATCAACAGG TCGTATCCAA ATCTTCCACC 2220 

TTCTCCATGT AGCTG6TGCG TACTGGCGTG GAAACAGCGA CAACGCTATG ATGCAACGTA 2280 

TCTACXXyTAC AGCTTGGTTT GACAAGAAAG ACTTGAAAAA CTACCTTCAA ATGCGTGAAG 2340 

AAGCTAAGGA ACGTGACCAC CGTAAACTT6 GTAAAGAGCT TGACCTCTTT ATGATTTCAC 2400 

AAGAAGTGGG ACAAGGTTTG CCATTCTGGT TGCCAAATGG TGCGACTATC CGTCGTGAAT 2460 

TGGAACGCTA CATCGTAAAC AAAGAGTTGG TTTCTGGCTA CCAACACGTC TACACTCCAC 2520 

CACTTGCTTC TGTTGAGCTT TACAAGACTT CTGGTCACTG GGATCATTAC CAAGAAGACA 2580 

TGTTCCCAAC CATGGACATG GGTGACGGGG AAGAATTTGT CCTTCGTCCA ATGAACTGTC 2640 

CGCACCACAT CCAAGTTTTC AAACACCATG TTCACTCTTA CCGTGAATTG CCAATCCGTA 2700 

TCGCTGAAAT CGGTATGATG CACCGTTACG AAAAATCTGG TGCCCTCACT GGCCTTCAAC 2760 

GTGTACGTGA AATGTCACTC AACGACGGTC ACCTATTCGT TACTCCAGAA CAAATCCAAG 2820 

AAGAATTCCA ACGTGCCCTT CAGTTGATTA TCGATGTTTA TGAAGACTTC AACTTGACTG 2880 

ACTACCGCTT CCGCCTCTCT CTTCGTGACC CTCAAGATAC TCATAAGTAC TTTGATAACG 2940 

ATGAGATGTis GGAAAATGCC CAAACCATGC TTCGTGCAGC TCTTGATGAA ATGGGCX3TGG 3000 

ACTACTTTGA AGCCGAAGGT GAAGCAGCCT TCTACGGACC AAAATTGGAT ATCCAGATTA 3060 

AAACTGCCCT TGGAAAAGAA GAAACCCTTT CTACTATCCA ACTTGATTTC TTGTTGCCAG 3120 

AACGCTTCGA CCTCAAATAC ATCGGAGCTG ATGGCGAAGA TCACCGTCCA GTCATGATCC 3180 

ACCGTGGGGT TATCTCAACT ATGGAACGCT TCACAGCTAT CTTGATTGAG AACTACAAGG 3240 

GGGCCTTCCX: AACATGGCTG GCACCACACC AAGTAACCCT CATCCCAGTA TCTAACGAAA 3300 

AACACGTGGA CTACGCTTGG GAAGTGGCCA AGAAACTCCG TGACCGCGGT GTCCGTGCAG 3360 
ACGTAGATGA GCGCAATGAA AAAATGCAGT TCAAGATCCG TGCTTCACAA ACCAGCAAGA 3420

TTCCTTACCA ATTAATTGTT GGAGACAAAG AAATGGAAGA CGAAACAGTC AACGTTCGTC 3480 

GCTACGGCCA AAAAGAAACA CAAACTGTCT CAGTTGATAA TTTTGTTCAA GCTATCCTAG 3540 

CTGATATCGC CAACAAATCA CGCGTTGAGA AATAAGAGTC TAGCATAAAA GCCTCCAATC 3600 

TGGAGGCTTT TTCTCATCTA TTTTTACTCA AGGACTAAGT TCACTTGAGC AAACTGAATC 3660 

CGCACTGTCG TTCCTTTTCC GACCTCAGAC TCGATACGAA TCTGGTGCCC CAGTTCTTCA 3720 

GAAATTTTCT TAGATAGATA AAGGCCAAGT CCAGAGGACT GCTGGGTCAA ACGGCCATTG 3780 

TATCCTGAAA AGCCACGTTC AAATACTCGG AGGACATCAC TGTTTTTTAT CCCGATTCCC 3840 

GTATCTTTGA TACAAAGCTC TTGGTCATCC ATATAAATCT CCAGACCACC TTCCTTGGTG 3900 

TACTTGAGAC TGTTTGAGAT GATTTGCTCA ATAACCACTA GCAGCCACTT TTTATCCGTC 3960 

ACGATTTCTT TATCAAGGTC ATGTAGATTG ACATTTAAGC CTTTTTGAAT AAAGAAAAGA 4020 

GCATATTTAC GAATTATTTC CTTGACCAAG TCCTCAATTT GAACCTGCTT TAAGACCAAA 4080 

TCATCATGGA AACTTTCTAA ACGCAGGTAC TGTAAAACTA GGTTGGTATA GGAGTCGATT 4140 

TTGAAAATTT CCTGTTCTAG CTGCTGCTTC AGTTGGCGGT CGACCACTTC TGCAACTAAG 4200 

AGTTGACTGG CTGCAATGGG GGTCTTTATC TGATGGACCC ACAAGGTATA GTAATCCAGC 4260 

AAATCCGTCA GTTTTCTTTC TGCTTTTGAC CTCTGCTGAT AGAGTTCCAT CTCACGCGCT 4320 

TCTAATTTTT CTGCTAAAGC TATTTCCAAA GGAGACTTGG CTTCCCTCTC TCCATAGAGA 4380 

AGTTCCTGGC GATAGACCTG CGTTTCCACC AATATGTCCC AAGTGAAAAA TAATATGGTT 4440 

ACAAAGCAAC ACAAGAAGAA AAAGTAGAGG AA6TAAATTC CTAGACTGGC AAATAAAAAC 4500 

TGAAAGAGTA AGACAAGAAA TGCCAAAGAA AGCAGATAGA TAAAAAGACG ACTACGGGAG 4560 

CGCAGATAGG CTAGAAAAAA TTGTTTCCAA TCAAGCATGC TTCAATCCGT ACCCTATTCC 4620 

TTTCTTGGTC TCGATAAATC CTACCAATCC CTGCTCCTCC AACTTTTTAC GCAAACGAGC 4680 

CACATTGACA GAGAGGGTAT TATCATCAAT GAAAAAGTCA CTGTTCCAAA GTTCCCGCAT 4740 

CAGGTCGTCA CGTGCTACGA TGTTGCCTGC ATGCTCAAAT AACACX3CGTA AAATCTGGAA 4800 

TTCATTCTTG GTCAAATTCA A6ACTT6CCC TTGATAATGT AAATCCATGG ATTTGGTATT 4860 

GAGGATAACA CCAGCATATT CCAGCAAACT CTCATCACGC CCAAACTCAT AGGAACGACG 4920 

CAACAAGCCC TGAACCTTAG CTAAAAGAAC CTGCTGGTCA AAAGGCTTGG TCACAAAGTC 4980 

ATCCGCCCCC ATATTGATTG CCATGACAAT ATCCATAGCC TGGTCTCTCX3 AAGAAAGAAA 5040 

CATGATAGGT ACCTTGGAAA TCTTGC6GAT TTCCTGACAC CAGTGATAAC CATTAAACAA 5100 

GGGCAAACCA ATATCCATGA GGACCAGATG AGGTTCCGAC TGAACAAATA GACTCAAAAC 5160 

TTCCATAAAG TCTTCTACCA GGACCACTTC AAATCCCCAT TCAGAGAGCA TTTTCCCAAT 5220

CTGTTGACGA ATGACCTGAT CATCTTCTAT TAATAAAATC TTGTGCATGC GCTTCTCCTT 5280

TTCCATTATT ATAACAGATT TTTCCATGCT AGATGGTCTG AAACTGAATT TGAAATAGCC 5340

TGTTTTTAGC CAGTACAAAC AGGCTATGCT ACTAGCTAAT TTGAGGGAAA TTTGCTAAGA 5400

TAAATAAAAA GAAAGGAGCT CTTATGGCCA ATATTTTTGA CTATCTGAAA GATGTCGCAT 5460

ATGATTCTTA TTACGACCTT CCCTTGAATG AGTTAGACAT TCTAACCTTA ATAGAAATCA 5520

CCTACCTCTC CTTTGATAAT CTGGTCTCCA CACTTCCTCA ACGTCTTTTA GATCTAGCAC 5580

CTCAGGTTCC AA6AGATCCC ACCATGCTTA CTAGCAAAAA TCGCCTTCAA TTATTAGATG 5640

AATTGGCTCA ACACAAGCGC TTCAAAAATT GCAAACTCTC CCATTTTATC AACGACATCG 5700

ACCCTGAACT GCAAAAGCAA TTTGCGGCTA TGACTTATCG TGTCAGCCTC GATACCTATC 5760

TGATTGTCTT TCGTGGGACA GATGACAGTA TCATTGGCTG GAAGGAAGAT TTCCACCTGA 5820

CCTATATGAA GGAAATTCCT GCTCAAAAGC ACGCCCTTCG CTATTTAAAG AACTTTTTTG 5880

CCCATCATCC TAAGCAAAAG GTTATTCTAG CTGGGCATTC CAAGGGAGGA AATCTCGCTA 5940

TCTATGCTGC TAGCCAAATT GAGCAAAGTT TGCAAAATCA GATCACAGCA GTTTATACAT 6000

TTGATGCACC TGGTCTCCAT CAAGAATTGA CACAGACTGC GGGTTATCAA AGGATAATGG 6060

ATAGAAGCAA GATATTCATT CCACAAGGTT CCATTATCGG TATGATGCTG GAAATl'CCTG 6120

CTCACCAAAT CATCGTTCAG AGTACTGCCC TGGGTGGCAT CGCCCAGCAC GATACCTTTA 6180

GTTGGCAGAT TGAGGACAAG CACTTCGTCC AACTGGAT^ GACCAACAGT GATAGCCAGC 6240

AAGTAGACAC AACCTTTAAA GAATGGGTGG CCACAGTCCC TGACGAAGAA CTTCAGCTCT 6300

ACTTCGACCT CTTCTTTGGC ACTATTCTTG ATGCTGGTAT TAGCTCTATC AATGACTTGG 6360

CTTCCTTAAA GGCGCTTGAA TACATTCATC ATCTCTTTGT CCAAGCTCAA TCCCTCACTC 6420

CAGAAGAAAG AGAAACCTTG GGTCGCCTTA CCCAGTTATT GATTGATACT CGTTACCAGG 6480

CATGGAAAAA TAGATAATAC TCTTGAAAAT TAAATGTATA CAAAACAAAA GACCTAGAAT 6540

ACATACTTTC ATGTGCATTC TAAGTCTTTT TAAATAGAAT CTAATAGTCA ATAAAAATCA 6600

AAGAGCATTG AGAGATAATG GGGCTTGGAA CGTCCCTCTC GCTTCAACAA AATGACCCCA 6660

TTATAGATTA AAAAGATGCC ACTTAGAAAA AGC/^AAAAAG GAAGTAAGAC AAAGGCAAAT 6720

ATATAAAAAG CTAACTGAAC ATTCTCGTAT CCATTTTTAT AAAAAAGGTA GGATAGATAA 6780

AAATAACTTG AAATGAGGGA TAATAAAAAT AATACTGGAT TCCACAAACT TCTATTATCC 6840

TTCCAAAATG ACACTATAAA GGCTAATACA ATTCCTATAA CGAGATACAT TTCTTACTCC 6900
TTTAATAGCT ACATTTTATC ATAATTATCC AAAGAAAAAA GAGGGCATTT ATCCCTCTTA 6960

ATCCTTCATC TGACTCTCTG CATCGGCCAC GACTTTTTCT AGACTGGTTT GACCAAGTTC 7020 

TGCCTCCATA GTCAACTGAA TTCTCTCCAA TTTTTGATCC AAAACATCAT GAATATGAGC 7080 

TCCTACAGGG CAATTTGGAT TCGGATTGTC ATGGAAACTG AAGAGTTGAC CTGTCTTACC 7140 

AAGACATTCG ACCGCCTGAT AAACATCTAA AAGACTAATA TCCTTAAGGT CCTTGACAAT 7200 

CTCTGTTCCG CCCGTTCCAC GCGCTACTGA AATCAGCTCT GCCTTCTTCA ACTGGGACAA 7260 

GATCTTTCTG ATAATCACAG GATTGACCCC GACACTAGCA GCCAGAAAAT CACTGGTCAC 7320 

CTTGCTTTCC TTCCCCTCGA GGGCAATGAT TATCAGCATA TGAGTCGCAA TGGTAAATCT 7380

ACTTGGAATT TGCATCCTCT TCTCCTTTTT AC6AGGCTAC CCTGCCTCTA CTCTTCTTTT 7440 

TCTATTATTA TACCCTTTTT AGTTGTAATG TCAATCGTTA CCACTTTTCA ACCAGTCGTC 7500 

TAACTCCCGA TCGCAGCCCT CTTTCTGAGC CAATTCTCTC AAAAATTCCT GATGATGAGT 7560 

ATGGTGGATC CCATTGACCA GACTTTCATA GTAAACCTCA AAATAGGGAA GTCTCAGGTC 7620 

TTTAGCCAGC TGCAATTCAG CTGCTACATC GTAGTCTACC CGTCGGAAGT CCATATCTAC 7680 

CAGGCCTTTG TCATCAAACT CCAAAATCAT ATACTGGGCC CGCAAGTCCT TCCGTAGCTG 7740 

AGCGTCCAAA AAGAAAGGTT GGCCAATCGA ACCCGGATTG ACAATCAATT GCCCACCAGT 7800 

CCCGTAACGA AGCAACTGCT GGTGAATATG TCCATAAACA GCAATATCAC AGGGAGGA7G 7860 

AGTCACCAAG CGGTCAAACT CCTCTTGTTT GCCA6TATGA ATCAACTCTC GCCCCCAGTP 7920 

CTTATCAGGC AGATGATGGC TAATTCCCAC CGTCAAATCC CCAAACTGAC GATGAATTTG 7980 

AAGAGGTTGA TTGT6GAGCA CTTCAATTTC TTCTAGGGAA ATTTCCTCTA AAACATACTG 8040 

GCACTGGCGC AAGAGATAGC GTTGACTGGG GCGAGTACTG TCCAATTCCT TACGGACACC 8100 

ATGCCAAAGA CTGTCTTCCC AGTTTCCCAA AACTCTAGCC GTAATCGGTA GTTGATCCAA 8160 

CAAGTCCAAA ATCCTTCTAC GCCCTGTCCC TGGCATGAGA ATATCTCCCA AAAGCCAGTA 8220 

TTCATCCACT CCTATCTGCC GAGCATCTGC CAAAACAGCC TCCAAGGCGG TGGTATTTCC 8280 

ATGAATATCT GAAAGAAGAG CTATTTTCGT CATATCCATC TCCTCGTTTT TTCTCTTGCA 8340

ATAAGTATAA CATAAAAAGT CACAGCTA6A 6AAATCTAGC TTTTTTTGAT ATACTAGATA 8400 

AAGATATTAG ACAAGAGGAA AC6AATGACC CCAAACAAAG AAGACTATCT AAAATGTATT 8460 

TATGAAATT6 GCATAGACCT GCATAAGATT ACCAACAAGG AAATTGCGGC TCGCATGCAA 8520 

GTCTCTCCCC CTGCCGTAAC TGAAATGATC AAACGAATGA AAAGTGAAAA TCTCATCCTA 8580 

AAGGACAAGG AATGTGGCTA TCTACTGACT GACCTCGGTC TCAAACTGGT CTCTGAGCTC 8640 

TATCGTAAGC ACCGCTTGAT TGAAGTTTTT CTAGTTCATC ATTTAGACTA TACAAGTGAC 8700 

CAGATTCACG AGGAAGCTGA GGTCTTGGAA CACACTGTCT CTGACCTGTT CGTGGAAAGA 8760

CTAGATAAAC TGCTAGGTTT CCCTAAAACC TGCCCCCACG GGGGAACTAT TCCTGCCAAG 8820

GGAGAACTAC TCGTT6AAAT CAATAACCTC CCACTAGCTG ATATCAAGGA AGCTGGCGCC 8880

TACCGCCTGA CTCGGGTGCA CGATAGTTTT GACATTCTCC ATTATCTGGA CAAGCACTCA 8940

CTTCACATCG GT6ACCAGCT CCAAGTCAAG CAGTTTGATG GCTTCAGCAA TACCTTCACT 9000

ATCCTCAGTA ACGACGAGGA TTTACAAGTG AATATGGACA TTGCAAAACA ACTCTATGTC 9060

GAGAAAATCA ACTAATTTCT CAAGTCCCCT ACCAACCCTG AAAGTTTTAT TTTGGCTCTT 9120

TGTCAACTGT AGTGGGTTGA AGTCAGCTAA GCTCGAGAAA GGACAAATTT TGTCCTTTCT 9180

TTTTTGATAT TCAGAGCGAT AAAAATCCGT TTTTTGAAGT TTTCAAAGTT CCGAAAACCA 9240

AAGGCATTGC GCTTGATAAG TTTGATGAGA TTATTGGTCG CTTCCAGTTT GGCATTAGAA 9300

TAGTGTAGTT GAAGGGCGTT GACAATCTTT TCTTTATCTT TGAGGAAGGT TTTAAAGACA 9360

GTCTGAAAAA TAGGATGAAC CTGCTTTAGA TTGTCCTCAA TGAGTCCGAA AAATTTCTCC 9420

GGTTTCTTAT TCTGAAAGTG AAACAGCAAG AGTTGATAGA GCTGATAGTG CTGITTCAAG 9480

TCTTGTGAAT AGCTCAAAAG CTTGTCTAAA ATCTCTTTAT TGGTTAAGTG CATACGAAAA 9540

GTAGGACGAT AAAATCGCTT ATCACTCAGT TTACGGCTAT CCTGTTGTAT GAGCTTCCAG 9600

TAGCGCTTGA TAGCCTTGTA TTCATGGGAT TTTCGATCCA ATTGGTTCAT TIATTTGAACA 9660

CGCACACGAC TCATAGCACG GCTAAGATGT TGTACAATGT GAAAGCGATC CAACACGATT 9720

TTAGCATTCG GGAGTGAAAC AGTCTGGGAG ACTGTTTCAG CCTGAGCCTA GAAATTTGAA 9780

A6CGAAGCTG TTTAGCCAAG TCATAGTAAG GACTAAACAT ATCCATCGTA ATGATTTTCA 9840

CTTGACAACX; AACGGCTCTA TCGTAGCGAA GAAAGTGATT TCGGATGACA GCTTGTGTTC 9900

TGCCTTCAAG AACAGTGATA ATATTAAGAT TATCAAAATC TTGCGCAATG AAACTCATCT 9960

TTCCCTTAGT GAAG6CATAC TCATCCCAAG ACATAATCTT TGGAAGCCGA GAAAAATCAT 10020

GCTCAAAGTG AAAGTCATT6 AGCTTGCGAA TGACAGTTGA AGTTGAAATG GCCAGCTGAT 10080

GGGCAATATC AGTCATAGAA ATTTTTTCAA TTAACTTTTG AGCAATyTTT TGGTTGATGA 10140

TACGAGGGAT TTGGTGATTT TTCTTTACCA GGGGAGTCTC AGCAACCATC ATTTTTGAAC 10200

AGTGATAGCA CTTGAAACGA CGCTTTCTAA GGAGAATTCT AGAAGGCATA CCAGTCGTTT 10260

CAAGATAAGG AATTTTAGAA GGTTTTTGAA AGTCATATTT CTTCAATTGG TTTCCGCACT 10320

CAGGGCAAGA TGGGGCGTCG TAGTCCAGTT TGGCX3ATGAT TTCCTTGTGT GTATCCTTAT 10380

TGATGATGTC TAAAATCTGG ATATTAGGGT CTTTAATGTC TAGTAATTTT GT6ATAAAAT 10440
GTAATTGTTC CATATGATTC TTTCTAATGA GTTGTTTTGT CGCTTTTCAT TATAGGTCAT 10500

ATGGGACTTT TTTTCTACAA TAAAATAGGC TCCATAATAT CTATAGTGGA TTTACCCACT 10560 

ACAAATATTA TAGAACCGTA AAAATAGAAG GAGATAGCAG GTTTTCAAGC CTGCTATCTT 10620 

TTTTTGATGA CATTCAGGCT GATACGAAAT CATAAGAGGT CTGAAACTAC TTTCAGAGTA 10680 

GTCTGTTCTA TAAAATATAG TAGATTGAAA TAAGATGTGA ACAACTCTAT CAGGAAAGTC 10740 

AAATTAATTT ATAGAATTAT TTTAGCAGTC AAGGTGTACT GTTATAGATT CAATATATTA 10800 

TATGACTATT AACCTTGTCT TCTCCTAAAA TTGACTTTCT TGTTTTCTTA TCTTGTCCAC 10860 

TCGAAACAAG TATTGTAAGA ATTTGATTAT TTTTGAAAGT ACTTTTAATA TACTT6ATAT 10920 

AGTTAAAAAA GATTTGAAAC TAAATTCCAA ATTAGAAAAA GACTTGAAAT ACTAAAAAAA 10980 
AAAAAGTATA CTCTAATTGA AAACGGTAAC AAAACTAATT TAGA6AATGA AATATAGA6T 11040

ATTTCTCTCT TAAAAGTTTT TGGTGAAACG AGATGTAGAA AGGAGATTTA GCCAAAGAGT 11100 

CTATTAGTGC TAGAATAATA GATTAGAATT ATTTTAGAAA AACGAAGTGA GCAGCTTATA 11160 

AATTCAAGTC CCCAAATAGA TTCATACTAG TATCTTTTGC AAAAAATAAA GGGCGACTTC 11220 

CTTCATGAAT ATCAATTTCA TCTATAAGGA AGGTAGCTAA TTGAACTAAC TTATTTATTC 11280 

TGTTTGTCGC TAGAAAAATC AGACCTCCTT GTGAAGATTG AGGAGATACT TAATGAAAAT 11340 

CAAAGAAGAA ACTAGCAAGC TAGTA6CAGA TTGCCCAAAA CACCGCTTTG AGGTTGTAGA 11400 

TAAGACTGAC CTATATAATC CAAGGTGAAG CGACTGTGGT TTGAAGAGAT TTTCAAAGAG 11460 

TATAGGCTAG AGAGTAGT6T TTTTATGTCC TTCTAGTAGA AAATGCTAGA CAGAAGAATG 11520 

GG6AACTTGG ATAGGAAAAA TAGATTGAGA AAGGAGGTTA GAAGAGATGA TTATTACAAA 11580 

AATTAGCCGT TTAG6AACTT ATGTGGGAGT AAATCCACAT TTTGCAACAT TAATAGATTT 11640 

TCTAGAAAAA ACAGGACTAG AAAATTTAAC AGAAGGTTCG ATTGCTATCG ATGGTAATCG 11700 

ATTGTTTGGG AATTGCTTTA CTTATCTAGC AGATGGTCAA GCAGGGGCTT TCTTTGAAAC 11760 

CCACCAAAAA TATTTGGATA TTCATTTAGT TTTGGAAAAC GAAGAAGCCA TGGCTGTTAC 11820 

ATCGCCGGAA AATGTAAGCG TTACCCAAGA ATATGATGAA GAGAAAGATA TTGAATTATA 11880 

CACAGGGAAA GTGGAACAQT TGGTTCATTT GAGAGCTGGC GAATGCCTCA TCACTTTTCC 11940 

AGAAGATTTA CATCAACCCA AGGTTCGTAT AAATGATGAA CCTGTGAAAA AAGTTGTCTT 12000 

TAAAGTTGCX3 ATTTCTTAAT GTAGAAAGAG AAGAACGATG AAAAAAATGA GAAAGTTTTT 12060 

ATGTCTAGCT GGAATTGCGC TAGCGGCTGT TGCCTTGGTA GCTTGTTCAG GAAAAAAAGA 12120 

AGCTACAACT AGTACT6AAC CACCAACAGA ATTATCTGGT GAGATTACAA TGTGGCACTC 12180 

CTTTACTCAA GGACCCGGTT TAGAAAGTAT TCAAAAATCA GCAGATGCTT TCATGCAAAA 12240 

GCATCCAAAA ACGAAAATCA AGATTGAAAC ATTTTCTTGG AATGACTTCT ATACTAAATG 12300

GACTACAGGT TTAGCAAATG GAAATGTGCC AGATATCAGT ACAGCTCTTC CTAACCAAGT 12360

AATGGAAATG GTCAACTCAG ATGCTTTGGT TCXTGCTAAAT GATTCTATCA AGCGTATTGG 12420

ACAAGATAAA TTTAACGAAA CTGCCTTAAA TGAAGCAAAA ATCGGAGATG ATTACTACTC 12480

TGTTCCTCTT TATTCACATG CACAAGTCAT GTGGGTTAGA ACAGATTTGT TAAAAGAACA 12540

TAATATTGAG GTTCCTAAAA CTTGGGATCA ACTCTATGAA GCTTCTAAAA AATTGAAAGA 12600

AGCTGGAGTT TATGGCTTGT CTGTTCCGTT TGGAACAAAT GACTTAATGG CAACACGTTT 12660

CTTGAACTTC TACGTACGTA rTGGTGGAGG AAGCCTCTTA ACAAAAGATC TTAAAGCAGA 12720

CTTGACAAGC CAACTTGCTC AAGATGGTAT TAAATACTGG GTTAAATTGT ATAAAGAAAT 12780

CTCACCTCAA GATTCTTTGA ACTTTAATGT CCTTCAACAA GCTACCTTGT TCTATCAAGG 12840

AAAAACAGCA TTTGACTTTA ACTCTGGCTT CCATATCGGA GGAATTAATG CCAACAGTCC 12900

TCAATTGATT GATTCGATTG ATGCTTATCC TATTCCAAAA ATCAAAGAGT CTGATAAAGA 12960

CCAAGGAATT GAAACCTCAA ACATTCCAAT GGTTGTTTGG AAAAATTCAA AACATCCAGA 13020

AGTTGCTAAA GCATTCTTAG AAGCACTTTA TAATGAAGAA GACTACGTTA AATTCCTTGA 13080

TTCAACTCCA GTAGGTATGT TGCCAACTAT TAAGGGGATT AGCGATTCTG CAGCCTATAA 13140

AGAAAATGAA ACTCGTAAGA AATTTAAACA TGCTGAAGAA GTAATTACTG AAGCTGTTAA 13200

AAAAGGTACT GCTATTGGTT ATGAAAATGG GCCAAGTGTA CAAGCTGGTA TGTTGACTAA 13260

CCAACACATT ATTGAACAAA TGTTCXTAAGA TATCATTACA AATGGAACAG ATCCTATGAA 13320

AGCAGCAAAA GAAGCAGAAA AACAATTAAA TGATTTATTT GAGGCTGTTC AGTAGATGTA 13380

AAAGACTAGA AAATAGGTGG GATAGTGAGC TGAAAAGCTC TAGCCCAATC TTGTAAAAGA 13440

AGGGAGAAGG AGAATGGTTA AAGAACGTAA TTTAACTCGC TGGATATTTG TTTTGCCAGC 13500

TATGATTATC GTAGGATTAC TCTTTGTTTA TCCGTTTTTC TCGAGTATTT TTTATAGCTT 13560

TACCAATAAG CATTTGATTA TGCCTAATTA TAAATTTGTT GGTTTGGCTA ACTATAAAGC 13620

TGTGCTATCA GATCCCAACT TCTTTAATGC GTTCTTTAAT TCAATTAAGT GGACCGTTTT 13680

CTCATTAGTT GGTCAAGTTT TAGTAGGGTT TGTATTGGCT TTAGCTCTTC ACAGAGTACG 13740

CCACTTCAAG AAATTATATA GGACATTATT GATTGTTCCT TGGGCATTTC CTACCATCGT 13800

TATTGCCrrC TCTTGGCAGT GGATTCTAAA CGGGGTTTAT GGCTACTTAC CTAATCTAAT 13860

CGTAAAATTA GGTTTAATGG AACATACACC TGCATTTTTG ACAGATAGTA CATGGGCATT 13920

CCTATGTTTG GTCTTTATCA ACATTTGGTT TGGAGCACCA ATGATTATGG TTAATGTGCT 13980
TTCAGCTTTG CAAACAGTAC CAGAAGAACA ATTTGAGGCT GCTAAGATAG ATGGTGCTTC 14040

AAGTTGGCAG GTGTTCAAGT TTATCGTCTT TCCACATATT AAAGTGGTTG TAGGACTTCT 14100 

AGTTGTTTTG AGAACTGTAT GGATCTTTAA TAACTTTGAC ATTATCTACC TCATTACTGG 14160 

TGGTGGACCA GCCAATGCTA CAACGACGCT TCCAATTTTT GCTTACAACC TGGGCTGGGG 14220 

AACTAAATTG TTGGGTCGTG CTTCAGCAGT TACAGTACTG CTCTTTATCT TCTTGGTGGC 14280 

GATTTGCTTT ATCTACTTTG CTATCATCAG TAAGTGGGAA AAGGAGGGTA GAAAATAATG 14340 

AAGAAGAAAT CCAGTATTTA TTTAGATATT CTCTCACATG TACTTTTAGT TGGTGCGACC 14400 

ATCGTTGCAG TTTTCCCATT GGTAT6GATT ATCATATCTT CTGTCAAAGG GAAAGGGGAA 14460 

TTAACTCAGT ATCCAACACG ATTTTG6CCT GAACAGTTTA CATTAGATTA TTTCACTCAT 14520 

GTTATCAACG ATTTGCACTT CATTGATAAC ATTCGAAACA GTTTAATCAT TGCCTTGGCT 14580 

ACAACCCTTA TTGCGATTAT TATTTCTGCT ATGGCAGCCT ATGGTATTGT TCGATTCTTT 14640 

CCTAAATTGG GAGCAATCAT GTCGAGACTA CTCGTCATTA CCTACATTTT CCCACCAATT 14700 

TTGTTAGCAA TTCCCTATTC AATTGCCATT GCTAAAGTTG GGTTAACAAA TAGTTTATTT 14760 

GGCTTGATGA TGGTTTATCT ATCTTTTAGT GTTCCATATG CAGTTTGGCT CTTAGTTGGA 14820 

TTTTTCCAAA CAGTTCCAAT TGGAATTGAA GAAGCGGCTA GAATTGATGG TGCAAATAAA 14880 

TTTGTTACGT TTTATAAAGT TGTGCTACCG ATTGTAGCAC CAGGTATTGT AGCAACAGCT 14940 

ATTTATACAT TTATCAATGC TTGGAATGAA TTCCTGTATG CCTTGATTTT GATTAACAAT 15000 

ACAGGAAAGA TGACAGTAGC AGTAGCCCTT CGTTCACTTA ATGGTTCAGA AATACTAGAC 15060 

TGGGGAGATA TGATGGCAGC GTCTGTTATT GTAGTTCTTC CATCAATTAT TTTCTTCTCT 15120 

ATCATCCAAA ATAAGATTGC AAGTGGATTA TCAGAAGGAT CTGTGAAGTA GACGAAAGAA 15180 

GGAAAAAAAT GAATAAAAGA GGTCTTTATT CAAAACTAGG AATTTCCGTT GTAGGCATTA 15240 

GTCTTTTAAT GGGAGTCCCC ACTTTGATTC ATGCGAATGA ATTAAACTAT GGTCAACTGT 15300 

CCATATCTCC TATTTTTCAA GGAGGTTCAT ATCAACTGAA CAATAAGAGT ATAGATATCA 15360 

GCTCTTTGTT ATTAGATAAA TTGTCTGGAG AGAGTCAGAC AGTAGTAATG AAATTTAAAG 15420

CAGATAAACC AAACTCTCTT CAAOCTTTGT TTGGCCTATC TAATAGTAAA GCAGGCTTTA 15480 

AAAATAATTA CTTTTCAATT TTCATGAGAG ATTCTGGTGA GATAGGTGTA GAAATAAGAG 15540 

ACGCCCAAAA GGGAATAAAT TATTTATTTT CCAGACCAGC TTCATTATGG GGAAAACATA 15600 

AAGGACAGGC AGTTGAAAAT ACACTAGTAT TTGTATCTGA TTCTAAAGAT AAAACATACA 15660 

CAATGTATGT TAATGGAATA GAAGTGTTCT CTGAAACAGT TGATACATTT TTGCCAATTT 15720 

CAAATATAAA TGGTATAGAT AAGGCAACAC TAGGA6CTGT TAATCGTGAA GGTAAGGAAC 15780 

ATTACCTCGC AAAAGGAAGT ATTGATGAAA TCAGTCTATT TAACAAAGCA ATTAGTGATC 15840

AGGAAGTTTC AACTATTCCC TTGTCAAATC CATTTCAGTT AATTTTCCAA TCAGGAGATT 15900

CTACTCAAGC TAACTATTTT AGAATACCGA CACTATATAC ATTAAGTAGT GGAAGAGTTC 15960

TATCAAGTAT TGATGCACGT TATGGTGGGA CTCATGATTC TAAAAGTAAG ATTAATATTG 16020

CCACTTCTTA TAGTGATGAT AATGGGAAAA CGTGGAGTGA GCCAATTTTT GCTATGAAGT 16080

TTAATGACTA TGAGGAGCAG TTAGTTTACT GGCCACGAGA TAATAAATTA AAGAATAGTC 16140

AAATTAGTGG AAGTGCTTCA TTCATAGATT CATCCATTGT TGAAGATAAA AAATCTGGGA 16200

AAACGATATT ACTAGCTGAT GTTATGCCTG CGGGTATTGG AAATAATAAT GCAAATAAAG 16260

CCGACTCAGG TTTTAAAGAA ATAAATGGTC ATTATTATTT AAAACTAAAG AAGAATGGAG 16320

ATAACGATTT CCGTTATACA GTTAGAGAAA ATGGTGTCGT TTATAATGAA ACAACTAATA 16380

AACCTACAAA TTATACTATA AATGATAAGT ATGAAGTTTT GGAGGGAOGA AAGTCTTTAA 16440

CAGTCGAACA ATATTCGGTT GATTTTGATA GTGGCTCTTT AAGAGAAAGG CATAATGGAA 16500

AACAGGTTCC TATGAATGTT TTCTACAAAG ATTCGTTATT TAAAGTGACT CCTACTAATT 16560

ATATAGCAAT GACAACTAGT CAGAATAGAG GAGAGAGTTG GGAACAATTT AAGTTGTTGC 16620

CTCCGTTCTT AGGAGAAAAA CATAATGGAA CTTACTTATG TCCCGGACAA GGTTTAGCAT 16680

TAAAATCAAG TAACAGATTG ATTTTTGCAA CATATACTAG TGGAGAACTA ACCTATCTCA 16740

TTTCTGATGA TAGTGGTCAA ACATGGAAGA AATCCTCAGC TTCAATTCCG TTTAAAAATG 16800

CAACAGCAGA AGCACAAATG GTTGAACTGA GAGATGGTGT GATTAGAACA TTCTTTAGAA 16860

CCACTACAGG TAAGATAGCT TATATGACTA GTAGAGATTC TGGAGAAACA TGGTCGAAAG 16920

TTTCGTATAT TGATGGAATC CAACAAACTT CATATGGCAC ACAAGTATCT GCAATTAAAT 16980

ACTCTCAATT AATTGATGGA AAAGAAGCAG TCATTTTGAG TACACCAAAT TCTAGAAGTG 17040

GCCX3CAAGGG AGGCCAATTA GTTGTCGGTT TAGTCAATAA AGAAGATGAT AGTATTGATT 17100

GGAAATACCA CTATGATATT GATTTGCCTT CGTATGGTTA TGCCTATTCT GCGATTACAG 17160

AATTGCCAAA TCATCACATA GGTGTACTGT TTGAAAAATA TGATTCGTGG TCGAGAAATG 17220

AATTGCATTT AAGCAATGTA GTTCAGTATA TAGATTTGGA AATTAATGAT TTAACAAAAT 17280

AAAGGAGAAA AACATGGTTA AATACGGTGT TGTTGGAACA GGGTATTTTG GAGCTGAATT 17340

GGCTCGCTAC ATGCAAAAGA ATGATGGAGC AGAGATTACT CTTCTCTATG ATCCAGATAA 17400

TGCAGAGGCG ATTGCAGAAG AATTGGGAGC AAAAGTAGCA AGTTCCTTAG ATGAGTTGGT 17460

TTCTAGCX3AT GAAGTAGATT GTGTTATCGT CGCAACTCCA AATAATCTTC ATAAGGAACC 17520

GGTTATTAAG GCTGCACAGC ATGGTAAAAA TGTTTTCTGT GAAAAACCAA TTGCGCTTTC 17580

TTATCAAGAT TGTCGCGAGA T6GTAGATGC GTGTAAAGAA AACAATGTAA CCTTTATGGC 17640

AGGACATATT ATGAATTTCT TTAATGGTGT TCATCATGCA AAAGAACTCA TTAATCAAGG 17700

AGTTATCGGA GACGTTCTAT ATT6TCATAC AGCTCGTAAT GGTTGGGAAG AACAACAACC 17760

GTCAGTATCA TGGAAAAAAA TTCGTGAAAA ATCAGGTGGT CACTTGTATC ACCACATCCA 17820

TGAATTGGAT TGCGTTCAAT TCCTTATGGG GGGCATGCCT GAAACTGTAA CCATGACAGG 17880

TGGAAATGTG GCCCATGAAG GTGAACATTT CGGTGATGAA GATGATATGA TTTTTGTCAA 17940

TATGGAATTT TCTAATAAGC GTTTTGCCTT GTTAGAATGG GGTTCAGCTT ATCGTTGGGG 18000

TGAACATTAT GTCTTAATCC AAGGAAGCAA AGGTGCCATC CGCTTAGACT TATTCAACTG 18060

TAAAG6AACT CTTAAGCTAG ATGGGCAAGA AAGCTATTTC TTGATTCACG AATCGCAAGA 18120

AGAAGATGAT GATCGGACTC GTATCTATCA TAGTACAGAG ATGGATGGAG CAATTGCTTA 18180

TGGTAAACCA GGTAAACGTA CTCCATTATG GCTATCATCT GTCATTGATA AAGAAATGCG 18240

CTATCTGCAT GAGATTATG6 AAGGAGCTCC AGTATCAGAA GAATTTGCAA AACTTTTGAC 18300

AGGTGAAGCT GCCCTAGAAG CAATTGCTAC TGCAGATGCT TGTACCCAGT CTATGTTTGA 18360

AGATCGCAAA GTAAAATTGT CAGAAATTGT AAAATAAATT TTGGTATTCT CCTATTTATA 18420

GGTCGACTTG CTCCTCTGAA AGTACTTTTA GAGGAGCTGT TTGACTTTGC TAGTTTTTGA 18480

AACTGAAATC TATTATACTA CAAACTATTG AAAGCGTTTT AATTTTAAGG TATAATAATC 18540

TCATAGAAAT AAAGAAAAGG AG6AAAGAGG ATGCCACAGA TTAGCAAAGA AGCCTTGATT 18600

GAGCAAATCA AAGATGGAAT CATCGTTTCT TGTCAGGCTC TTCCTCATGA ACCGCTTTAT 18660

ACAGAAGCGG GAGGGGTGAT TCCCTTGCTG GTCAAAGCGG CTGAGCAAGG TGGAGCAGTC 18720

GGTATCOGAG CAAACAGTGT TCGCGATATC AAGGAAATTA AGGAAGTCAC TAAACTTCCA 18780

ATCATTGGGA TTATCAAACG TGATTATCCA CCTCAGGAAC CCTTCATCAC GGCTACTATG 18840

AAAGAAGTTG ATGAATTGGC AGAACTGGAC ATCGAGGTGA TTGCTCTGGA TTGTACCAAG 18900

CGTGAACGCT ACGATGGTTT GGAAATTCAA GAGTTCATTC GTCAGGTTAA GGAGAAATAT 18960

CCTAATCAGC TTTTGATGGC TGATACTAGT ATCTTCGAAG AAGGGCTAGC AGCTGTAGAA 19020

GCAGGAATTG ACTTTGTCGG AACAACCTTA TCAGGCTACA CATCCTACAG TCCAAAAGTA 19080

GAC6GTCCAG ATTTTGAATT GATTAAGAAA CTCTGTGATG CTGGTGTAGA TGTCATTGCA 19140

GAAGGAAAAA TTCATACACC AGAACAAGCC AAACAAATCC TTGAATATGG AGTGCGAGGC 19200

ATCGTTGTTG GTGGCGCCAT TACTAGACCA AAAGAGATTA CAGAACGCTT CGTTGCTAGT 19260

CTTAAATAAG ATGT6AGGGG GAGTTTTATG TTTAAAGTTT TACAAAAAGT TGGAAAA6CT 19320
TTTATGTTAC CTATAGCTAT ACTTCCTGCA GCAGGTCTAC TTTTGC5GGAT TGGTGGTGCA 19380

CTTTCAAACC CAACCACXSAT AGCAACTTAT CCAATACTAG ACAATAGTAT TTTTCAATCA 19440 

ATATTCCAAG TAATGAGCTC TGCAGGAGAG GTTGTATTCA GTAATTTGTC ACTACTTCTC 19500 

TGTGTGGGAT TATGTATTGG CTTAGCGAAA CGAGATAAAG GAACCGCTGC GTTAGCAGGA 19560 

GTAACTGGTT ACTTAGTTAT GACTGCAACG ATCAAAGCTT TGGTAAAACT TTTTATGGCA 19620 

GAAGGATCTG CAATTGATAC TGGAGTTATT GGAGCATTAG TTGTCGGAAT AGTTGCCGTA 19680 

TATTTGCACA ACCGATATAA CAATATTCAA TTACCTTCCG CTTTAGGATT CTTTGGAGGT 19740 

TCACGCTTCG TTCCTATTGT TACATCGTTC TCTTCTATCT TGATTGGCTT TGTCTTCTTT 19800 

GTTATTTGGC CACCTTTCCA ACAACTTCTT GTTTCTACAG GTGGATATAT TTCTCAGGCG 19860 

GGTCCAATTG GAACTTTTCT ATATGGATTT TTAATGAGAC TTTCTGGAGC AGTAGGCTTA 19920 

CATCATATAA TTTACCCTAT GTTTTG6TAT ACTGAACTT6 GTGGTGTTGA AACTGTTGCA 19980 

GGACAAACAG TGGTTGGAGC TCAAAAAATA TTTTTTGCTC AATTAGCCGA TTTGGCCCAT 20040 

TCTGGATTAT TTACAGAAGG AACAAGGTTT TTTCCAGGTC GTTTCTCAAC AATGATGTTC 20100 

GGTTTACCGG CTGCCTGTTT AGCGATGTAC CATAGTGTTC CTAAAAATCG TCGTAAAAAA 20160 

TACGCGGGTT TGTTTTTTGG AGTTGCTTTA ACATCTTTTA TTACCGGTAT TACAGAACCA 20220 

ATTGAATTTA TGTTTCTATT CGTCAGTCCG GTTCTATATG TTGTTCACGC ATTCCTTGAT 20280 

GGTGTTAGCT TCTTTATTGC AGACGTCTTA AATATTTCAA ^TAGGAAACAC ATTTTCAGGA 20340 

GGTGTAATCG ATTTCACTTT ATTTGGAATT TTGCAGGGGA ACGCTAAGAC GAATTGGGTT 20400 

CTTCA«3ATTC CATTTGGACT TATTTGGAGT GTTTTGTATT ATATTATTTT TAGATGGTTC 20460 

ATTACTCAAT TCAACGTTCT AACGCCAGGG CGAGGAGAAG AAGTAGATTC TAAAGAAATT 20520 

TCTGAATCCG CAGATTCAAC TTCAAATACT GCAGATTATT TAAAACAGGA TAGCCTACAA 20580 

ATTATCAGA6 CCTTG6GTGG ATCAAATAAT ATAGAAGATG TAGATGCTTG TGTGACACGT 20640 

TTACGTGTAG CTGTAAAAGA AGTTAATCAA GTTGATAAAG CACTTTTAAA ACAAATTGGT 20700 

GCAGTTGATG TCTTAGAAGT GAAGGGTGGC ATTCAAGCAA TCTATGGAGC AAAAGCAATC 20760 

TTATATAAAA ATAGTATTAA TGAAATTTTA GGTGTAGATG ATTAAGTACT TACTGACTTA 20820 

ATAAAAAACA GAGGAGAGl'G ATGGATGAGT AGGATGAAAT GAAATCGCAT ACAAGAAATA 20880 

AAGAACTCAT TATCCAAGTT GGATACX3CTT ATTACATAGG AGAATACAAA TGAAATTTAG 20940 

AAAATTAGCT TGTACAGTAC TTGCGGGTGC TGCGGTTCTT GGTCTTGCTG CTTGTGGCAA 21000 

TTCTGGCGGA AGTAAAGATG CTGCCAAATC AGGTGGTGAC GGTGCCAAAA CAGAAATCAC 21060 
TTGGTGGGCA TTCCCAGTAT TPACCCAAGA AAAAACTGGT GACGGTGTTG GAACTTATGA 21120

AAAATCAATC ATCGAAGCGT TTGAAAAAGC AAACCCAGAT ATAAAAGTGA AATTGGAAAC 21180 

CATCGACTTC AAGTCAGGTC CTGAAAAAAT CACAACAGCC ATCGAAGCAG GAACAGCTCC 21240 

AGACGTACTC TTTGATGCAC CAGGACGTAT CATCCAATAC GGTAAAAACG GTAAATTGGC 21300 

TGAGTTGAAT GACCTCTTCA CAGATGAATT TGTTAAAGAT GTCAACAATG AAAACATCGT 21360 

ACAAGCAAGT AAAGCTGGAG ACAAGGCTTA TATGTATCCG ATTAGTTCTG CCCCATTCTA 21420 

CATGGCAATG AACAAGAAAA T6TTAGAAGA TGCTQGAGTA GCAAACCTTG TAAAAGAAG6 21480 

TTGGACAACT GATGATTTTG AAAAAGTATT GAAAGCACTT AAAGACAAOG GTTACACACC 21540 

AGGTTCATTG TTCAGTTCTG GTCAAGGGGG AGACCAAGGA ACACGTGCCT TTATCTCTAA 21600 

CCTTTATAGC GGTTCTGTAA CAGATGAAAA AGTTAGCAAA TATACAACTG ATGATCCTAA 21660 

ATTCGTCAAA GGTCTT6AAA AAGCAACTAG CTGGATTAAA GACAATTTGA TCAATAATGG 21720 

TTCACAATTT GACGGTGGGG CAGATATCCA AAACTTTGCC AACGGTCAAA CATCTTACAC 21780 

AATCCTTTGG GCACCAGCTC AAAATGGTAT CCAAGCTAAA CTTTTAGAAG CAAGTAAGGT 21840 

AGAAGTGGTA GAAGTACCAT TCCCATCAGA CGAAGGTAAG CCAGCTCTTG AGTACCTTGT 21900 

AAACGGGTTT GCAGTATTCA ACAATAAAGA CGACAAGAAA GTCGCTGCAT CTAAGAAATT 21960 

CATCCAGTTT ATCGCAGATG ACAAGGAGTG GGGACCTAAA GACGTAGTTC GTACAGGTGC 22020 

TTTCCCAGTC CGTACTTCAT TTGGAAAACT TTATGAAGAC AAACGCATGG AAACAATCAG 22080 

CGGCTGGACT CAATACTACT CACCATACTA CAACACTATT GATGGATTTG CTGAAATGAG 22140 

AACACTTTGG TTCCCAATGT TGCAATCTGT ATCAAATGGT GACGAAAAAC CAGCAGATGC 22200 

TTTGAAAGCC TTCACTGAAA AAGCGAACGA AACAATCAAA AAAGCTATGA AACAATAGTC 22260 

CTTAGTTATT CTATAAAAAG TAGTTTTTTA AAGAACCTAA GAGTGTATAC CCCCTTTTCC 22320 

CTCTACACAG ATAGTGTAAG AAAAGGGGGC TTTTGTTTAA AATGTAAGAA ACTGTCACGA 22380 

AATTAAAATG AAGTTCTTAC ATAAGCGAAT CATAAAAAAT TTCATTTTGA TTTTAAAACA 22440 

GTTCAAGAAA GTCAAAAAAT TATTCTATTT GAAAGAGAG6 TGCCGACTGT GAAAGTCAAT 22500 

AAAATCCGTA TGCGGGAAAC AGTGATTTCC TACGCTTTCC TAGCACCAGT ATTATTCTTC 22560 

TTTGTCATCT TTGTGTTGGC TCCGATGGTG ATGGGCTTCA TTACAAGTTT CTTTAACTAC 22620 

TCAATGACTA AATTTGAGTT T6TAGGCTTG GATAACTATA TCCGTATGTT TAAAGATCCT 22680 

GTCTTTACAA AATCTCPGAT TAACACAGTT ATTTTGGTTA TTGGATCTGT ACCAGTTGTT 22740 

GTTCTATTCT CACTCTTTGT AGCATCTCAG ACCTATCATC AAAATGTCAT TGCCAGATCC 22800 

TTCTACCGTT TCGTCTTCTT CCTTCCTGTT 6TAACGGGTA GTGTTGCCGT GACAGTTGTT 22860 

TGGAAATGGA TTTATGACCC ACTATCAGGG ATTCTAAACT TTGTCCTTAA GTCCAGCCAC 22920

ATCATCAGCC AAAACATTTC TTGGTTGGGA GATAAAAACT GGGCATTGAT GGCGATTATG 22980

ATTATTCTCT TGACCACTTC AGTTGGTCAG CCCATCATCC TTTATATCGC TGCCATGGGG 23040

AATATTGACA ATTCACTGGT TGAAGCGGCG CGTGTTGATG GTGCAACTGA GTTTCAAGTT 23100

TTTTGGAAGA TTAAATGGCC AAGCCTTCTT CCAACAACTC TTTATATTGC AATCATCACA 23160

ACAATTAACT CATTCCAGTG TTTCGCCTTG ATTCAGCTTT TGACATCTGG TGGTCCAAAC 23220

TACTCAACAA GTACCTTGAT GTACTACCTT TACGAAAAAG CCTTCCAATT GACAGAATAC 23280

GGCTATGCCA ACACAATTGG TGTCTTCTTG GCAGTCATGA TTGCTATCGT AAGCTTTGTT 23340

CAATTTAAAO TACTTGGAAA CGACGTAGAA TACTAAAGAA AGGAGACAGC TATGCAATCT 23400

ACAGAAAAAA AACCATTAAC AGCCTTTACT GTTATTTCAA CAATCATTTT GCTCTTGTTG 23460

ACTGTGCTGT TCATCTTTCC ATTCTACTGG ATTTTGACAG GGGCATTCAA ATCACAACCT 23520

GATACAATTG TTATTCCTCC TCAGTGGTTC CCTAAAATGC CAACCATGGA AAACTTCCAA 23580

CAACTCATGG TGCAGAACCC TGCCTTGCAA TGGATGTGGA ACTCAGTATT TATCTCATTG 23640

GTAACCATGT TCTTAGTTTG TGCAACCTCA TCTCTAGCAG GTTATGTATT GGCTAAAAAA 23700

CGTTTCTATG GTCAACGCAT TCTATTTGCT ATCTTTATCG CTGCTATGGC GCTTCCAAAA 23760

CAAGTTGTCC TTGTACCATT GGTACGTATC GTCAACTTCA TGGGAATCCA TGATACTCTC 23820

TGGGCAGTTA TCTTGCCTTT GATTGGATGG CCATTCGGTG TCTTCCTCAT GAAACAGTTC 23880

AGTGAAAATA TCCCTACAGA GTTGCTTGAA TCAGCTAAAA TCGACGGTTG TGGTGAGATT 23940

CGTACCTTCT G6AGTGTAGC CTTCCCGATT GTGAAACCAG GGTTTGCAGC CCTTGCAATC 24000

TTTACCTTCA TCAATACTTG GAATGACTAC TTCATGCAAT TGGTAATGTT GACTTCACGT 24060

AACAATTTGA CCATCTCACT            ACCATGCAGG CTGAAATGGC AACCAACTAT 24120

GGTTTGATTA TGGCAGGAGC TGCCCTTGCT GCTGTTCCAA TCGTCACAGT CTTCCTAGTC 24180

TTCCAAAAAT CCTTCACACA GGGTATTACT ATGGGAGCGG TCAAAGGATA ATACTCTGCG 24240

AAAATCTCTT CAAACTACGT CAGCTTCACC TTGCCATACT TAAGTATTGC CTGCGGTTAG 24300

CTTCCTAGTT TGTTCTTCAA TTTTCATTGA GTATAGGAAA ATCAATCTAT CAAGATACAG 24360

AAGTATATTT TATAGATTTA GAGAATATAG AGGTTATAAG TGTCTACAAA ATGGAGGGTA 24420

TGCAGTTACT TTATGAAGTT TTGTCA6ACA CTTATAAACT TAAGAATGGT TTTAGTTAAC 24480

TATCAGAAAC GAAGGAAAGA GTATGATTTT TGACGATTTG AAAAACATCA CCTTTTACAA 24540

AGGGATTCAT CCTAATTTAG ACAAGGCTAT CGACTATCTC TACCAACATC GTAAGGATTC 24600
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TTTCGAATTA 


GGAAAGTATG 


ATATTGATGG AGATAAAGTC TTTCTAGTTG 


TTCAGGAAAA 


24660 


TGTCCTCAAT 


CAAGCTGAAA 


ATGATCAATT TGAGTATCAT AAGAACTATG 


CAGATTTGCA 


24720 


TTTGCTGGTA 


GAAGGACATG 


AATATTCGAG CTACGGTTCA CGTATCAAAG 


ACGAGGCAGT 


24780 


AGCATTCGAC 


GAAGCGAGTG 


ACATTGGCTT TGTTCATTGT CATGAACACT 


ACCCACTCTT 


24840 


GTTGGGTTAT 


CACAATTTTG 


CGATTTTCTT CCCAGGTGAG CCACATCAGC 


CAAATGGTTA 


24900 


TGCAGGCATG 


GAAGAAAAGG 


TTCGAAAATA TCTCTTTAAA ATTTTGATTG 


ATTAAAAATA 


24960 


GGATGAATTG 


TTTTTTTGTA 


AAGCTTTGAT AATACTCTAC CATGAAATTG 


ATCTTTGTGA 


25020 


6GTAGAGAAA 


TGAGAATAAA 


ATATTTAAAA ATTGGTATCT TCTAAGTATG 


CTGCAAGAGC 


25080 


TAGTTTCTTA 


GATGGACAGG 


GGATTACA6T TGATGAGATG GCTTGGATAA 


TTAGGGGCAT 


25140 


TGTGAATGCA 


TTGATTGGTA 


GATACATAAA ATTAGGTACT TATGCGGCTA 


AGTATGGTAT 


25200 


TAGTATGGCA CGCTCGATCT TAAGTAGGGT AGCTGCAACT GCAQCAGCAA GAGTAGGATT 25260

ACTGACCAAG ATTTCTGGAT GGATTTTACG AGTAGCTGTG AATGTAGCTG ATGTATATGG 25320

TAATTTTGCC AACAATATTG CTGCAGCTTG GGATGCATAT GATAAAATTC CTAACAATGG 25380

TCGTATAAAC TTTTAAAATG CGAGAATGAA AGCACTTTGT ATTTTTTTAT TGAATATGTT 25440

AGCTTGGACA GTGCTTGCAA TGATAATTCG TGGAGGGCTA GATGGATTTG ATAGGCATAC 25500

TTGGAGTACT ATTTTAATTG CGTCGCTGTT CGGGGTATAT GATTATAAGC CCATAGATAA 25560

AAATAGAAAA AAGTCCAAAA GAAAAAATAG ATTTGTTCAT GGTAGGGACT TATGAAAGCT 25620

TTACTGACAA AAAAGAAAAC AGTTTACAAA GAAAAATGAT GGAGGAGCAA ACATGGCACA 25680

AAAAGGAGTA AGCCTTATCA AGGCAGCATT TGATACAGAT AACTTTCTCA TGCGTTTTAG 25740

TGAGAAGGTC TTGGACATCG TGACAGCCAA TCTTCTTTTT GTCGTCTCTT GTTTACCCAT 25800

CGTGACGATT GGAGTGGCTA AAATCAGCCT CTACGAGACC ATGTTOGAAG TTAAGAAGAG 25860

CAGACGGGTG CCTGTTTTTA AAATCTATCT AAGATCTTTC AAGCAAAATC TGAAACTAGG 25920

TCTTCAGCTG GGTTTAATGG AGTTAGGAAT TGTGTTTCTT ACCCTTTCAG ATCTCTATCT 25980

TTTCTGGGGT CAAACAGCTC TGCCCTTCCA ATTGCTGAAA GCCATTTGTT TAGGTATTCT 26040

GATTTTTCTT ACTATCGTGA TGCTGGCTAG TTACCCTATC GCGGCACGTT ATGACCTATC 26100

TTGGAAAGAA ATTCTTCAAA AAGGATTGAT GTTGGCTAGT TTTAACTTTC CTTGGTTCTT 26160

CCTCATGTTA GCCATTCTTG TCCTCATTGT GATGGTTCTT TATCTGTCCG CCTTCAGTCT 26220

ACTCTTAGGT GGCTCAGTCT TCCTACTTTT TGGGTTTGGA CTATTGGTCT TTATCCAGAC 26280

TGCATTGATG GAGAAAATTT TCGCAAAATA CCAATAGGAG CTTTATTTCT GAAACTACTT 26340

TCAAAGGCTC CAAACGCTAT TCTATAAGCG AGAAACTAAA ATCGG 26385

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



CCTGCCCGCA TTGCCCTAGG CATTAAGTAA ACATATAAAA GCATGTGAGA GACTGTTGGA 60

AAAGCGAGGA AATTTCCCCT CTTTTCCTCT AGTCTCTCCT TTCTTTTGCT GATTTTATTC 120

AAAGAAAATG ATATAATAGT AGTTATGGAG AAAAAGAAAT TACGCATCAA TATGTTGAGT 180

TCAAGTGAGA AAGTAGCAGG ACAGGGAGTT TCAGGTGCTT ACCGTGAATT AGTTCGTCTT 240

CTTCACCGTG CTGCCAAGGA CCAATTGATT GTTACAGAAA ATCTTCCAAT CGAGGCAGAT 300

GTGACTCACT TTCATACGAT TGATTTTCCC TATTATTTAT CAACCTTCCA AAAGAAACGC 360

TCAGGGAGAA AGATTGGCTA TGTGCATTTC TTGCCAGCTA CACTTGAGGG AAGTTTGAAA 420

ATTCCATTTT TCTTAAAGGG AATTGTGAAA CGCTATGTAT TTTCTTTTTA CAACCGGATG 480

GAGCACTTGG TTGTGGTCAA TCCTATGTTT ATTGAGGATT TGGTAGCAGC TGGTATTCCA 540

CGTGAAAAAG TGACCTATAT TCCTAACTTT GTCAACAAGG AAAAATGGCA TCCTCfACCA 600

CAAGAAGAGG TAGTCAGACT GCGCACAGAT CTTGGTCTTA GTGACAATCA GtTTATCGTA 660

GTAGGTGCTG GGCAAGTTCA GAAACGTAAA GGGATTGATG ACTTTATCCG TCTGGCTGAG 720

GAATTOCCTC AGATTACCTT TATCTGGGCT GGTGGCTTCT CTTTTGGTGG TATGACAGAT 780

GGTTATGAAC ACTATAAGAA AATTATGGAA AATCCCCCTA AAAATTTGAT TTTTCCAGGC 840

ATTGTATCGC CAGAGCGGAT GCGCGAATTG TATGCTCTAG CGGATCTTTT CTTGTTGCCT 900

AGTTACAATG AGCTCTTTCC TATGACTATT TTAGAAGCTG CGAGTTGTGA GGCTCCTATT 960

ATGTTGCGTG ATTTAGATCT CTATAAGGTG ATTTTGGAGG GAAATTATCG GGCGACAGCG 1020

GGTAGAGAAG AGATGAAAGA GGCTATTTTG GAATATCAAG CAAATCCTGC TGTCTTAAAA 1080

GATCTCAAAG AAAAGGCTAA GAATATTTCC AGAGAGTATT CTGAAGAGCA TCTGTTACAA 1140

ATCTGGTTGG ACTTTTATGA GAAACAAGCC GCTTTAGGGA GAAAGTAAAA AGTGAGGTAA 1200

TCTATGCGAA TTGGTTTATT TACAGATACC TATTTTCCTC AGGTTTCTGG TGTTGCGACC 1260

AGTATTCGAA CCTTGAAAAC AGAACTTGAA AAGCAGGGAC ATGCTGTTTT TATCTTTACG 1320

ACGACAGATA AGGATGTCAA TCGCTACGAA GATTGGCAAA TTATCCGCAT TCCAAGTGTT 1380

CCTTTCTTTG CTTTTAAGGA TCGTCGCTTT GCCTACCGAG CTTTTAGCAA GGCACTTGAA 1440

ATTGCTAAAC AGTATCAGCT AGATATTATC CATACTCAGA CAGAATTTTC TCTTGGCCTG 1500 

TTGGGGATTT GGATTGCGCG TGAATTGAAA ATTCCAGTCA TCCATACCTA TCACACCCAG 1560 

TATGAAGACT ATGTCCATTA TATTGCTAAG GGGATGTTGA TCCGGCCGAG TATGGTCAAG 1620 

TATCTGGTTA GAGGTTTCCT GCATGATGTG GATGGGGTTA TTTGCCCTAG TGAGATTGTC 1680 

CGTGACTTGC TATCTGATTA TAAGGTCAAG GTTGAAAAAC GGGTCATTCC TACTGGGATT 1740 

GAATTAGCCA AGTTTGAGCG TCCGGAAATC AAGCAGGAAA ATTTGAAAGA ACTGCGTAGT 1800 

AAACTAGGGA TTCAAGATGG TGAAAAGACG TTGCTTAGTC TTTCGAGAAT CTCCTATGAA 1860 

AAAAATATTC AAGCAGTTTT AGCAGCCTTT GCTGATGTTC TGAAAGAGGA AGACAAGGTT 1920 

AAACTGGTAG TAGCTGGGGA TGGCCCTTAT CTGAATGACC TCAAAGAGCA AGCCCAGAAC 1980 

CTAGAGATTC AAGACTCAGT CATCTTTACA GGGATGATTG CTCCTAGTGA GACGGCTCTT 2040 

TACTATAAAG CGGCX3GATTT CTTCATTTCX3 GCATCGACAA GCGAAACGCA AGGTTTGACC 2100 

TACTTGGAAA GCTTAGCCAG TGGAACACCT GTCATTGCTC ACGGAAATCC TTATTTGAAC 2160 

AACCTCATCA GTGATAAAAT GTTTGGAACX: TTGTACTATG GAGAACATGA TTTGGCTGGT 2220 

GCTATTTTGG AAGCCCTGAT TGCAACACCA GACATGAACG AGCATACCTT ATCAGAGAAA 2280 

TTGTATGAGA TTTCAGCTGA GAACTTTGGG AAACGAGTGC ATGAGTTTTA TCTGGATGCC 2340 

ATTATTTCAA ATAACTTCCA GAAAGATTTG GCTAAAGATG ATACGGTCAG TCAGCGTATC 2400

TTTAAGACAG TTTTGTATCT TCAGCAACAG GTGGTTGCTG TACCTGTAAA AGGATCTAGA 2460 

CGCATGTTGA AGGCTTCAAA AACACAGTTG ATCAGTATGA GAGACTATTG GAAAGACCAT 2520 

GAAGAATAGA AAGAGGAACA GCTATGAAAA AAACAATTAA TGAGAAGCGG TCGTGATAAA 2580 

AAGATTGCGG GTGTTTGTGC TGGGGTGGCC CATTATCTGG ATATGGATCC GACTATCGTT 2640

CAAGTCATTT GGGGTGTTCT TACTTGCTGT TACGGAGCTG GAATTGTAGC TTACATTATT 2700 

TTATGGATTA TCGCGA 2716

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13926 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CTTTGGTTTT GCCTTATTCA AGACATGAGG GCCATCAGGA ATGATCTGAA ACTGCGAATC 60

TGTTAACAGT CTATGGAGAG CTTTCATAGA ACTAAGATTC GGTTTATCTT TGCTGCCACA 120

AATTAGTAAG GTTGGATAAG GGTAAGTTCC TGCTATATCC GTTAAATCAA GTGTCTTCAA 180 

CTCCTCAGAA ACTCCGACCA TAAGAGTCTT GTCTGCTCCC TGTTTTTCAA ATACTCTTTT 240 

GGGAAGTAGT TTAAAAATCA GCAATTGAAG ATAAAATAGG ATATTCCCTG CTAATTTAAG 300 

CGGGCATCCT GACAGAATCA AAGCTCGAAG ATTTGGTAAA TCGTAACTGG AAAGTTCTAG 360 

TGTCAGGGCA GCACCTAAGG ACAATCCAAT CAAAACAAAA GGTTCTGTCT CTTGAGCTAG 420 

GTGCTGATAA ACTCGCTCTT TAGCTTGTTG ATAGTTACTA ACTCCAGAAG GAAATAACTC 480 

GATAGCCTCA GAAGGATAAT CTGTCAGTAG ATTCCGAACT TCTTTCCAAG ACTCTGCTGA 540 

CTGCCCTAAC CCATGCAAAA ATATTAATTT CATCTAGTTC TCCTCAAGGC TTAATTCATA 600 

CAAGCCTCTC ACTGCATTAC AGCCGTAAAT AGCTTCTGCT TGGGTTAAAT CTGCCAAGGT 660 

CAAGACTTTC TCTTCTACCT GTCCTGTTTC TAGCAAATGC TGACGGTAAA TTCCTGGCAA 720 

GATTCCAAGT CGGATAGGCG GTGTGTAGAG TTTTCCAGCG ATTTTCAGAA CCAAATTTCC 780 

TATAGAGGTT TCAAGCAGTT CTCCTGACTT ATTGTGGTAA ATCTTCTCTT GTTCTCCTAG 840 

GCTCAAATGC GGTCGGTGAG TGGTTTTAAA GTAGGTAAAG GATTGATTCA AAGCAGCTTC 900 

CTGAAGACAG ACTTGGGCCT GACAAAAGCT TGTACTGAGA GGGGTTAATA CTTGACGATT 960 

GACTTCTATC TCTCCAGATT TGCTAAGGCT GATTCGCAAG CGGTAATCTC GATTAGCTTC 1020 

ACAATCCTGA CACTCTTCCT CAATCTTGTG TCCCAAGTCT TCTGCATCAA AAGGAAAAGC 1080 

AAAATAACGA CTAGCTTTTC TCAGCCTTTC CAGATGTTGT TCTTCAAACA TCAGTTGTTT 1140 

TTGGCTGATT TTTCCAGTTG TAATTAATTG GAAGCGAGCT TGTTTACGAT AGAGAACTGC 1200 

TGCCTTTTGA TGAACCTCTC GGTATTCAGA TTCCCATGTG CTATCCCAAG TAATCCCTCC 1260 

GCCAACTCCA TAAATGGCTT GACCTTTGTG AAGTTGAATG GTACGAATGG CCACATTAAA 1320 

AATCCGTCGT CCATTTGGAA GCAAGAGACC AATCGTTCCA CAGTAGACTC CACGCGGTTG 1380 

AGGCTCCAAG TCCTTGATAA TCTCCATTGT CGCAATTTTC GGTGCACCCG TTATGGAACC 1440 

ACAAGGAAAG AGTGAGCGGA AGATTTCAAC AAGGTCCACA TCCTCTCGCA ACTGACTCTT 1500 

GATGGTCGAA GTCATCTGCC AAACAGTTGA ATACTGCTCT ACCTGACACA GACGCTCCAC 1560 

GTGCTCGCTC CCAACTTCAG AAATACGGTT CATATCATTG CGCAAGAGGT CCACAATCAT 1620 

CATATTTTCA GAGCGATTTT TGGGATCCTG TTCCAACCAA CTGGCCTGTT CAAGATCTTC 1680 

TTGGTCAGTT ACCCCACGCT GAGTCGTCCC CTTCATTGGT CGTGTTGTCA ACTCGCGATC 1740 

ATTTTGCTCA AAAAAGAGCT CTGGGCTCAT GGAAATCACT GTCATCTCGT CATGTTCCAC 1800

ATAGGCATTG TAGCCCGCCT CCTGCTCTAC CACCATACGA TTGTAGATGG CAAAAGGATT 1860

GGCATTTAAC TTTTGCTTAA GTTGGACGGT GTAGTTGACC TGATAGGTAT CTCCCTGCCG 1920 

TAAATGATGG TGAATTTGGG CAATGGCCTT TTCATAGTCT GCTGCAGACG TTACTTCCTG 1980 

CCAATTTGAG GGCAAATCAA TATCCTCATA AGTCAGAGGA ATAGGGGAAG TTTCTACGAT 2040 

ATCATGAACA GTAAAGTAAA GCAGGTACTC TCCCAGTAGG GGATCCTTGT GAACTGCTAA 2100 

TTTTTCCTCA AAAGCAGGTG CAGCCTCGTA GCTGACATAC CCCACCACAT AATAACCTTG 2160 

CTCTTGGTAG CTTTCCACTT GTGCCAGCAA ATCTGCCACT TCTTCTACAT TTCTCGTTTT 2220 

CAACTCTTTA ATAGGCTGGG TAAAGGTATA TCTCTCCCCC AAAGTCCTAA AATCAATCAC 2280 

TGTTTTTCTA TGCATACCTT AAGTATAGCA TAAAATAAGA AAACCCTCAT CCGCAAAGCA 2340 

GATGAGAGAT TTCAATTATT TAAAGATTGA AGTTTTAAAG CTATTTGTTT GTTGAAGAAG 2400 

TTTCTTATAA ACAGCTTCTT TTAATTTAAC TGTATTATTC ATAGATACTG TTTTATTACC 2460 

GTTTGCTTCT TGTTTAAGAG TTTCGGCATC TTTTTTAACA GCTTCTTTAA ACAATGTCAG 2520 

TAAATCATCG TATGATGAAA CGGAAGAACC ATTTACTTCG AATGTTGTTA ATCCTTTCGT 2580 

TGCTTTATCT TTAACTTCTT TGAAGTAAGC TTTTTTAAAT TCTTCAATAG TATTAAATGT 2640 

ATTGTTAGAT ATTTTCTTGA TAATATATTC ATCACTTAGA ACAGACTCAC CATCTGTTTT 2700 

AGATTGTTGT TTATATTTAT TTGAAGCATA ACCTAAGAAC CCATTTTCGT ATCCGTAGTA 2760 

ACCCCATAAT CTAAAAGCAT TATGTTTGAA TGAAACAGCT CCAGGAGCAC CTTTACTAGT 2820 

ATTACCTCCG TAGATACCGG TCATCATTCT AACACCTACA TAAGGTGATT GATCGTTATA 2880 

GCTAATTGCT TCGGGTTTAT AGATACCATT ACCTGGATTG CGATTAGTCA TTAATTGTTG 2940 

ATCAACTAAA TCATTAACAG ATTGAATATT TAATTCATTT TTCTCTTCTT GACTTAGATT 3000 

TCGAATTTTA TCCCATTGAT TTAATTTATT GTTATCACGG TATTCTCTAT CTATTTTTTT 3060 

GAACCATGCA CTATTTAAAT CTTTATTTTG TTGAGAAATC ACAGATTCAG CCTCAATTTC 3120 

ATCAAGAAGA GTTAAAGTGT CATTATAACC CTTCATATAT CTATTAATAT CTTCTCGTGT 3180 

TTTTAGAGTT TTTGGATCTG TAATATACCA CTGATTCCCA TCATTTTTGC GTTTAAATAC 3240 

CATATTAATA OCTAAAGAAC CAAACTCATC AAATCCACTA CCAGTAACAG GAGTTTGTAG 3300 

CATACCCTGA GCATATGCTT CAGCATCAGT ACCTTCACGG TGTCCAAAGC CACCTAAGTA 3360 

AATCGCACGG TCGTTGACGT GTGTTGTTTC ATGTGTGTAA ACTGAAATAC CGTATTCACC 3420 

AACCATTTCT AAATGAACAT ATTTTACATC AGTTCTAATA TCATCAGAGT TAGGATATAT 3480 

AGCAGCATAA GCTCCTGTTC CATTATAATT ATAATACTTA TCCATAGGAC CAAAGAATTC 3540 

TCTAAGAGGA GTATATACTT TGTCGGTATT ATAGCGGCCA TATTTTTCAA CCCATCCACC 3600

AGGAGCXSTTA TAACCTTCCC AAATAGGAAT AACAGCATCT CTTAGTAGTC GTTGTTTAAC 3660

GTTATCAGAC GCTAGACGAT ACCAGAAATC ATAATAGTTT CTATAACCAT CTGCAGCTTT 3720 

GTTAACGATA TCTTTAATAT CTTCTAATGA TTTTTTACCT AATCGCTCTG CACTACCAAA 3780 

GGCAATTGCA TTATAATTTG AAATTAAATA AAGATGTGCT TTATCAATAT TCAGTAGTGG 3840 

GAGTATAGTA TTTCTAAGGT GACTTCGTTT TAAATTATCG AATGCACGAT GTTTAGAATT 3900 

TTTAATTTCT TCGACCTCAG AAGCGCGTTC TGCGATGTAG ACATGGTCTT CTGTAGCATC 3960 

AATAAACCAA TCGTTCATAT TGTCTATATT TGTGAACAAT TGTCTATTAT AATTTAAAAA 4020 

TGCATCTAAA TTACCTGATT TAGTATATTT AGCCAATACT TGACCGAATG CGTCGAATGT 4080 

ACGTGAACCT TTAATGTTGT TCTCTTTAGA ACCGATTTCA ATTAATCTGT CTAATACGCT 4140 

AACTTTTTCA CCATAGAAAT CTGGTTTGAA TAGCATTAAT TCTTTAATAT TAACATCACC 4200 

AAATTTAACT CCATAGTAAC GATTTAGGTA AGTTAAACCT AGTAATAAAG CTGCTTTGTT 4260 

TTTCTCGACT TTATCACGAA TCATTTGACG AGCAGCTGGA GAATCATTTA GTTGATGTTC 4320 

TTCGTTTTGA ACTAATTTTG TGATTAGGTT TGTTAAGTTT TCTTTAACAT CTGTGAAGCT 4380 

TTCTTCTAAA TATAAATCTT TGATTGCATT AACTCTATAG TCACCTAATC GATTTAGATG 4440 

CTGATACATC GTTTGAGACT GAAGCTCTAC TGATTCTAAA ATAGATTTTA TATCATTAAC 4500 

AAGAGTAGTG TTATCTTTTT GAACGATATT AGGTGTATAT TTAATTCCTA AGTCAGTTAT 4560 

AGTATATTCT TTTACATTAC TTAAACCTTC ACTGCTAGAA GACAAGTTAA AGTAATCTTT 4620

TGTACCGTCC GCATAGTGAA CAATAATTTT ATTAGCTTCA TCTAGGTTTG TGATAAACTC 4680 

ATTGTTGTTC ATCGCGGTAA CAGAAAGAAC TTCTTTAGTA TTTAGATGGT GTTCTTTATT 4740

TAATTTATTA CCTTGATATA CAATATAATC TTTATTGTAG AATGGTATTA ATTTTTCAAG 4800 

ATTTTTATAG GCTTGGTTAT ATTCAGCGTT ATAATCTTGA ATACTAGAAT AGGCTTTTTC 4860 

TTCATTAAGT TTTGCAAGAG GAGATAGATC ACTTTCTAAT TTATCAGCAG TAATATTGAA 4920 

AGTAGTAACT TTAGCATCAG CTTGTTCTTT AGTTAATTTA GTAAATGTTT TAGATTTCCT 4980 

AAATGATCTA TTACCTGACG AATATCCCTC TACCGCATAT AAATCTTTTA TATGAGCACT 5040 

AGCATAATCA GAATCATCAA CGTCGTTAGA GCCGAATAAC TCCTCTCCAC GGAT/^TCTT 5100 

AGCATAGCTG ACAGAATTAC TTACCGTACC TACAGGCCAA GTCTTACTTG CTATTGCTCC 5160 

AACTTCTACT GGATTTGAAA CATCTATTTT ACCTTTTACA ACCGACTCAG TTAGGAGAGC 5220 

TTTTGTACCA ATAAGATGGT CTAGAGTTAA TCCATAATCT ACTTTAGGAA CTAACAAGCT 5280 

GGCGCGTGTT TTGTTTCCTG TAATAGTAGC ATCAACATAT GCTTTTCTAA CAATTCCTCT 5340

ATAGTTTGTA CCTGCAATTC CCCCTGTATG AGAGCCATTT CCACTTGTAG AGTGTAGTTT 5400

GCCAAAGAAA GCAACATTTT CAATACGAGT TCCATCATTC ATATTATTTA CAAATCCAGC 5460 

AACATTATTA CGACCTGAAA GTGTGCCTGT AATTTTGACA TTTGTAATAA CTGAAGAACC 5520 

TTTCATAGTA TTGGCTAATG ATGCAATATT ATCTTGACCA GAACGTTCTA TCTCTACATT 5580 

TTCAAAATTC ACATTATTTA TCGTTGCGTT TGTTATCACA TTAAATAATG GATGTTCCAA 5640 

TTCAGTAATA GCAAATTGTT TTCCTTCAGA ACTTAAAAGT TTTCCTGTGA ATTCTTTAGT 5700 

GATATATGAT TTTCCATTAG GAACAACATT TCTAGCGCTC ATTGATTGTC CCAGACX3ATA 5760 

TTCTTTTGAA GGATCGTTTT GAATAGCTTC CACTAATTCT TTGAAATTAT AATATACATT 5820 

ATCTTCGTGG ACTTTAGGTT TTTCAATATA GTGAACGTAT TCTTCTTCAA ATTTATTATC 5880 

AGCAGTTCTA GAGACTAAAT TGTCTGCGAT TGCTGTAACT TTATATACAG GTGTTCCGTT 5940 

AACCGTAGTT TCTTCTATAT TTTTAACAGC TAGTAATGTA GTTTTCTGAT TATTTGAAGT 6000 

TATTTTTAAA TAATAATTGC TCTTATCATC AGGAATAGTT GTTATCAGTG ATTCATTAGT 6060 

TTCTTTTCCA TTTTCGTATT TGATTAAATC TGTACXSTTTA ATATTTTTAA GCTCAACTTT 6120 

TTTAAGATCT AATTGAATAT TTTGATTTTC TAGAGTTTCA GTTTCTTCAC CGTTACCTCT 6180 

GTCGTAAATC ATAGTTGTAG ATAGGGTGTA TTCTTTGTAG TACTCTAGGT TCTTAAATGC 6240 

AGCGCTTATA GTTTCTGTTG TTACCTTGTC ATCTGTAAGG ACTACAGTAT TAATAACTTC 6300 

TTCTCCTTTT TTCAATTCAG CTGTGATTGA TTTGATTTTT GTTTTGTTTT GATTTTCTAG 6360 

AGTATACTTA GCAACAGCTT CACGTTCCAA TATTTTCTTA TCGGTACTAG TCAATGTTAA 6420 

TATTGGCTTT TCAGATAATT CAACCAATTT TTCAATAGTT GCAGTTAATT TTTCAACAGC 6480 

TTCGTTAACT TCACTTTGTT TAGCATCTGT ATTAGCTGCA ACTTTTTCAG CCTTTGTAAC 6540 

TTCAGTTTGG AGGTTTTGCC AACTTCTATC ACTGTAATGT TCTTTTACCT TTGTTTTTGC 6600 

ATCTGCAATC GTATTGTTTA ATTCAGTTTT ATCAACGTTT AGAGCGTCAA TAGCCGTTTT 6660 

AAGTTTATTT GTCTCGCTAT TTACCTCAGG CTGTTTTACA GGCTCTGAAG CATAGACACC 6720 

TTTTGCAGTT TCTAAAACAG GTCCAAGAGC ATTGTAACTT GCTGTAGAAT AATCAGTAGG 6780 

AGAAACTGAA CTAGCTTTAT CAATTTGATT ATTTAACTCA CTTTTATCAA CTGGTTCTTT 6840 

AGTACCAATA CCCTTTATTT TATCTTCTGG TTTCGGTGTT TCCTCTACAG CCTTCTCTTC 6900 

TTCAGGAACT TCTGGTTGCT TTTCTGGCTC AACTGGTGCC GTTGGTGCCT GTTCGTCTTC 6960 

TCTTGGCGCG ACTGGTTCAC CTGCTTGTTC AACTTTTGGT TCCTCTGTTG GTTCTGTTTG 7020 

TTTTTCTACA GCAGGGGTTT CAACTTTTGG TTGTTCAATA GATTGATTAA CAGTCTCCTC 7080

TTTTGGTTCT ACAGTTTCTT CAGCCTTGGT ATCTGGAGTT GACTCTTCTT GTTTCGGTGT 7140
7200
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880

GTTGATCGAC ATTTTGAAGA CCAACTCCCC CACGTTTGAG TTGACTTTGA CTACTATCAC 8940

CAGCATCTTG GAAGCCAACG CCATCATCCT CAATACGGAT GACCAATCCC GAATCCTGTT 9000

TCTGGACAGA AAGTTTAATA TGGCCCTGAC CTTCCTTTTC CTTAATGCCA TGGTAAAGAG 9060

CATTTTCTAC AAGGGGTTGT AGGACCAGCT TGGGTAAGAC TAAATTATCA AAGGCAACAT 9120

TTTCATTAAT TTCGTATTCC AGCTTATCTC CATAGCGTTG TTTCTGGATA AAGAGATACT 9180

GGCGGACATG ATTGATTTCG TCAGAGAGAC AAATCAAGTC CTTGCCTTGA TTGAGCGCCA 9240

AGCGGAAATA GGTTGCCAAG GACTTGGTCA CCTGCACCAC TCGCTGACTA TCATGAAATT 9300

CAGCCATCCA GATGATGGTG TCCAAAGTGT TATAGAGGAA ATGTGGATTA ATCTGGCTCG 9360

AAAGGGCTTG AAGTTGGTAC TGACGGGTCG TTTCTTCCTG GCTACGAATA GCTACCATCA 9420

ACTGATCAAT CTGATCCAAC ATAGCATTAA ATTGGCGAGT TACTTCTCTC AGTTCATAGG 9480

CACCAACTTC CTTGGCACGA AGATTTTGAG CACCAGAAGC AATTTCCAAC ATGGTTTCTC 9540

TCAAATCCTT CAAAGGAGCA ATCCAGCGTT TAAGACTGAA CCACACTAAG CAGAGACAGA 9600

CAAGAAGAGA TGTGACACTG GCCCCAAGCA AGGTCCACAA GAGCTGACTC CGAACCTGGT 9660

CTAACTTTTC CAATGATGAC ACGCCAAGCA CCGTCCAATC AGTTCCTGCA ATCTTCTCTT 9720

GACTGACGTA GGATTTGTGA CCAGGAGTAT AACCCTGACC TGTATCGATG TAGGGTTTCA 9780

TAGCCTCCAT TTTGCTAGAC GAACTATAAA CTGTGTGTTG AGGATGGTAG ACAAATTCAT 9840

GGTTTTCATT GATAATGAAG GCAAAGCCCT GCTGCCCCAA CTGGAGTTGA TTGAGATAGG 9900

CTTCCAGAGT TTCATAAGAA ATATCCAAAC GAAGCACACC AAGATTGGCT CCCTTTGCAT 9960

CAACAAGTTC TTGAGTGACA GAAATGACCC ACTGACTATC TGATTTACGA GCTGGAGTCA 10020

AAACAGGCAT AGCTCCCTGA TGAATGGCCT TTTGGTACCA ATCCTCAGCC ATCATATCAG 10080

AGGAAGTTTT CATCTGCACA CTGTCATCTG TAGAAATGAC CTGACCAGAT TTGGTCACCA 10140

GCACAACAGT TTTCAAGTCC TTATCTGACT TCAAGATGGT CAAAAACAAA TCTCGGATTC 10200

CCTCGACCTT GTCTTGACTG GGATTCTCAG CATAGGCCAG AACATCCGTC TGCTGGGTCA 10260

AACCAGTCGA GGTGGTTTCT AGTTTTTTGA TATAAGACTG AATAAAGTGG CTAGTCTGGC 10320

TGATGGTCGT TTGGCTGTTG CCCTCAATGG TGGCCTCAAT GGCTGAAGAA CTTGATTGAT 10380

AGTAGAAAGT TCCAACCAGA GCTAGGAGAA TGAGAAAGAC CAGAAAGATG GAAATAACCA 10440

TTCTAACTAA AAGAGAAGAA CGCTTCATCG GTCTTCTCCC TTCTTAAACT GACGAGGTGT 10500

CACACCTGCA ATCTGCTTAA AACGTTGGGT AAAATAGTTC ATATCTTCAA AACCAACCTT 10560

CTCTGCGATC TCATAAATCT TCAGATCTGT AGTTAAAAGC AAGAGCTTGG CTTGTTTAAC 10620

ACGTTCTCTC ACCAGATAAT CCTGAAAAGG CAAGCCCAAC TCTTTCTTAA TCAAGGAACT 10680

CAGATAGGTC GGACTAAAAC CTAAGTCACT GGCTAAAGAC TTTAAACTAA ATTGGCTATC 10740

AGCCAGATGA GACTGGATTT TCTGGGCCAT GTTTCCTTCA AACCTATTAG TCAATAAATC 10800 

TTGTAACTGC TCTTCTTTCT CTTCCTTGTC TAGTTTTTGT TTGATTTTCC CCAACATTTC 10860 

CTCAATATCC TGACGAGAAA AGGGTTTGAG CAGGTAGTCG TCCACACCTA GTTTGACAGC 10920 

AGACAAGGCA TAATCAAAAT CATCGTAACC TGTTAAAAAG ACCAAATGAA CCTGAGGATA 10980 

GGTTTCTCGT ACCAGACTGG CCAACTGGAT GCCATTTAGA TGAGGCATGT TGATATCGGT 11040 

TAAAATGATA TCTGGCACCT GCTTTTGGAT CAATTCCCAA GCCTGCCTTC CATTTTCAGC 11100 

CTGACCGATG ATTTCCATAT CGTAGGCTGC TACATTGACC AGTTTAGTCA AACCTTGTCT 11160 

TACCAGATAT TCATCTTCTA CGATTAAGAT TGTGTAGGTC ATGCTCTGCT CCTTTACXTAC 11220 

TTACTAGTAT CAGTATAGCA AAATTCTCCT CTAACTGCTT AGGAAAGACC TCTTATACTC 11280 

AATAAAAATC AAAAAGTAAA CTAGGAAGAT AGCCACAGGT TTCTCAAAGT ACTGCTTTGA 11340

GGTTGTAAAT AAAACTGACG AAGTCGACTC AAAGTATAGC TTTGAGGTTG TAGATAAAAC 11400 

TGACGAAGTC GATAACCCTA CATACGGTAA GGCGACGCTG ACX3TGGTTTG AAGAGATTTT 11460 

CGAAGAGTAT TAATCAACAT AATCTAGTAA ATAAGCGTAc CTTTTTCTTC CATTTGGTCT 11520 

TTGGGAATAA AGCGGATAGA GAGGCTATTG ATACAGTAAC GTAAGCCGCC CTTGTCCTGT 11580 

GGACCATCCG TAAAGACATG CCCAAGGTGA GAATCTCCTA CTCGGCTCCG CACTTCCATA 11640 

CGCGTCATAT TGTAGGACTT ATCTTCCTTG TAGGTGACAA CATCTGGACT GATGGGTTGG 11700 

GTAAAACTAG GCCAGCCACA ACCAGACTCA AATTTGTCTT TTGATGAAAA GAGAGGTTCC 11760

CCAGTTGCTA TATCCACATA GATACCGGAT TCAAATTTAT CCCAGTAACG GTTTGAGAAA 11820 

GCTCGTTCTG TTTGATTTTC CTGGGTAACT GCATACTCCT CAGGTGACAG GGTCTTTTTC 11880 

AATTCCTCAT CACTTGGTTT TGGATATTTG CTGGCATCAA TGACAGGATA GGCCGCCTGA 11940 

TTAACATTGA TATGGCAGTA GCCATTTGGA TTTTTCTTGA GATAGTCTTG ATGGTAATCC 12000 

TCAGCCACCA CAAAATTCTT CAAGTTTTCC TTTTCAACTG CTAGAGGTTG ATCX3TATTTC 12060 

TTAGCCACCT CATCAAAGAC TTGGTTAATC ACTTCCAAAT CCTTGTCATC TGTGTAATAA 12120 

ACACCAGTAC GGTACTGGGT CCCCACATCA TTTCCTTGTT TATTTTTGCT GGTTGGATTG 12180 

ATAATGCGGA AATAGTGAAG CAGGATTTCC TTGAGAGAAA TTTGCTTGGC ATCATAGGTG 12240 

ACATGGACGG TTTCTGCATG ACCTGTTTGG TTAATCAATT CGTACTTGGT TGTTTCTCCT 12300 

CTACCATTTO CATAGCCTGA AACGGCATCC GTCACCCXTGG GAACACGTGA GAAATATTCC 12360 

TCCACTCCCC AGAAACAACC TCCAGCTAGA TAAATTTCGT GCAAGTCTGC GTCTTTACTA 12420

ATTTCTGTTT TTTTCACTGC TTTTCCTCCT TGGCTAACTG CCGCCTTTTC AATTTGCGAG 12480

GCATCTGTCT GCCCTGCATT TCGTATCAAT AGAACATAGA AACCGGTTAT GGCTAGAAAA 12540

AATACTCCTA GCAACAAGAA GATTTTTAAC TTATCATTCA TAAGACGCCT CCTAGGCTAA 12600

TTCCTTCAAA GTTTGCAAAA TTGCATCTTT TTCCATGAAT CCTGGATGTG TTTTGACCAG 12660

CTTGCCTTCT TTGTCTATAA AGGCTTGGGT TGGGTAAGAA CGGACACCAT AAGTTTCCAA 12720

AAGTTTGCCT GATGGGTCAA CTAGGACTGG GAGATTTTTA TAATCCAATC CCTTATACCA 12780

ATTCTTAAAG TCCGCTTCAG ATTGCTCTCC CTTATGTCCT GGTGACACTA CTGTCAAGAC 12840

CACATAGTCA TCACCAGCTT CTTTAGCAAT CTCATCCGTA TCTGGAAGAC TAGCCAGACA 12900

GATGGAACAC CAAGAAGCCC AGAATTTGAG ATAGACTTTC TTGCCCTTGT AATCAGATAA 12960

ACGGTAGGTC TTGCCATCTA CTCCCATCAA TTCAAAATCA GCCACCTCTT TCCCTTTAGC 13020

TGCGCTTGTT TTACTAGCTG TCTGCTCCGT CTTCATTTCA TCTTTCGTrP GGTGTTCACT 13080

AGTCACGGAC TTGCCTGAAC AAGCCGTCAA ACAAAGGAGC gaacx:tgctc CAAGAACACA 13140

TGTTTGCCAT TTTTTCATAT TGATATTCCT TTCCATTTTA TTCAAATAAT TGACTTAAAA 13200

TTGAAGCATT TCCAAACAGA ACCAAGAAGC CCATCACAAT AATGAGAAAA CCACCCACTT 13260

TTTTGAGGAT TCCGAGATAG GGATGAAGTT TTCGGAAATG TTTCAAAACA TAACTAGAGG 13320

TCAGAGCTAG AAGCAAGAAT GGTAGCGCCA AGCCCAGCGT ATACACCAAC ATGAGACCAG 13380

CTCCCTGCCA AGCTCCTGAA CCACCTGAAG CCGCCAAGGC CAAAACAGAC CCCAGAACCG 13440

GCCCCACGCA AGGCGTCCAA GCAAAACTAA AGGTCAAGCC CAATAAAAAT GCCTGACTAT 13500

AGCCCTTACC ATTTTGCCCC TGTCCTTGCA GTTGTAGCCT CTTTTCCTTA TAAAGCCCCT 13560

TAAAGTGTAG AATCTCCATT TGGTGCAAAC CAAGAAGGAT AATAATTGCC CCAGTAAGAT 13620

ATTGGAACCA AGAAGCATAA AGCAAATCGC CTAAAAAACC AGCTCCATAG CCCAACAAAA 13680

TAAATATAAA GGAAATTCCT GCTATAAAGG CCAGAGTTCG TAATAAACTA GTAACTGAGA 13740

TTGAAAATTT GCCGCTAGAA GCCTGAGCAC CATCCTTATC ATCTAGTAAC ACTCCTGTAT 13800

AGACCGGTAA CAAAGGTAAG ATACAAGGAG AAAAGAAGGA TAGAATCCCT GCCAAAAAGA 13860

CACTTAGAAA AAAGAAAATA TGACCCATAA AGTTCCTCCT ATCATTTTAT TGATAGATTT 13920

ATTATA 13926

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20199 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CCCAGCAGAA AAATGGCATT TGGAGATAAT GGAAATCGTA AAAAAACTAT GTTTGAGAAA 60 

ATAACCTTGT TTATCX3TGAT TATCATGCTA GTAGCAAGTT TATTGGGAAT TTTTGCAACT 120 

GCAATTGGTG CCCTCAGTAA TCTATAAAAT AGATTCAAGA AAATTTAGTG ACTGQGATTT 180 

CCCAGCCCTT TTTTAAAGTG AGAAGAAATA ATGAGTATGT TTTTAGATAC AGCTAAGATT 240 

AAGGTCAAGG CTGGTAATGG TGGCGATGGT ATGGTTGCCT TTCGTCGTGA AAAATATGTC 300 

CCTAATGGAG GCCCTTGGGG TGGTGATGGT GGTCGTGGAG GCAATGTGGT CTTCGTTGTA 360 

GACGAAGGAC TACGTACCTT GATGGATTTC CGCTACAATC GTCATTTCAA GGCTGATTCT 420 

GGTGAAAAAG GGATGACCAA AGGGATGCAT GGICGTGGTG CTGAGGACCT TAGAGTTCGA 480 

GTACCACAAG GTACGACTGT TCGTGATGCG GAGACTGGCA AGGTTTTAAC AGATTTGATT 540

GAACATGGGC AAGAATTTAT CGTTGCCCAC GGTGGTCGTG GTGGACGTGG AAATATTCGT 600 

TTCGCGACAC CAAAAAATCC TGCACCGGAA ATCTCTGAAA ATGGAGAACC AGGTCAGGAA 660

CGTGAGTTAC AATTGGAACT AAAAATCTTG GCAGATGTCG GTTTAGTAGG ATTCCCATCT 720 

GTAGGGAAGT CAACACTTTT AAGTGTTATT ACCTCAGCTA AGCCTAAAAT TGGTGCCTAC 780 

CACTTTACCA CTATTGTACC AAATTTAGGT ATGGTTCGCA CCCAATCAGG TGAATCCTTT 840 

GCAGTAGCCG ACTTGCCAGG TTTGATTGAA GGGGCTAGTC AAGGTGTTGG TTTGGGAACT 900 

CAGTTCCTCC GTCACATCGA GCGTACACGT GTTATCCTTC ACATCATTGA TATGTCAGCT 960 

AGCGAGGGCC GTGATCCATA TGAGGACTAC CTAGCTATCA ATAAAGAGCT GGAGTCTTAC 1020 

AATCTTCGCC TCATGGAGCG TCCACAGATP ATTGTAGCTA ATAAGATGGA CATGCCTGAG 1080 

AGTCAGGAAA ATCTTCAAGA CTTTAAGAAA AAATTGGCTG AAAATTATGA TGAATTTGAA 1140

GAGTTACCAG CTATCTTCCC AATTTCTGGA TTGACCAAGC AAGGTCTGGC AACACTTTTA 1200 

GATGCTACAG CTGAATTGTT AGACAAGACA CX^VGAATTTT TGCTCTTACGA CGAGTCCX3AT 1260 

ATGGAAGAAG AAGCTTACTA TGGATTTGAC GAAGAAGAAA AAGCCTTTGA AATTAGTCGT 1320 

GATGACGATG CGACATGGGT ACTTTCTGGT GAAAAACTCA TGAAACTCTT TAATATGACC 1380 

AACTTTGATC GTGATGAATC TGTCATGAAA TTTGCCCGTC AGCTTCGTGG TATGGGGGTT 1440

GATGAAGCCC TTCGTGCGCXJ TGGAGCTAAA GATGGGGATT TGGTCCGCAT TGGTAAATTT 1500 

GAGTTTGAAT TTGTAGACTA GGAGACTGOT ATGGGAGATA AACCGATATC TTTCCGAGAT 1560 

GCGGATGGTA ATTTTGTTTC CGCCGCAGAC GTTTGGAATG AAAAGAAATT GGAAGAACTA 1620

TTTAATCGTC TCAATCCAAA TCGTGCCTTG AGATTGGCAC GAACTAAAAA GGAAAATCCA 1680

TCTCAGTAAA GAAGCTAAAA AATCCCGTGC CTCATCAGAC ACGGGATTTT GTGGTACGAC 1740 

AGGCATGTAT AGCAAACTGA ATCTGGAATA GCACAGCATA TCTTCTAAAA TATAGTAAAA 1800 

TGAAATGAGA ACAGGACAAA TCGATCAGGA CAGTAAAATC GATTTCTAAC AATGTTTTAT 1860 

AAGCAGAGAT GTACTATTCT AGTTTCAATC AACTATATTG TTATAAATTG ATTTGAATTT 1920 

CAAAATTAAA TTGTTTGATT CTTATTTCAA TTTGTTATAG TATATCTGAT GTCAAAGTTC 1980 

TCGGCGAGTC AAATAGCGAT TCCCAAGCCT GACTATCGTG AGGTAGCGGA TTAAAATGGT 2040 

CTGGGGATAG ACCGTTTTAA GTCTGACGCT GGAAATAAGA ATTGTCAGAA GAAGGGATAG 2100

CGAAATCGTG GCTCTACGAA CAGGAACGTG ATAATAAGGC GTATATAGCG GATAAGAGGG 2160

CATCAAACTC TAAAGTCCAA AAAGGTAGTC GTAACCTATA TGCGTAAATC ACGAGAGTAA 2220 

TTGAATTCGT ACTAAGATTT TCTATTTTCA CTGTAACCTT TTAACXKrCCT TATATCTTGT 2280 

ATACACGAGG AAAGATGTAC GACTTATCCX: GTGAGGTCTA TCACTATAAA GAGAAAACGA 2340

CAGATAGAAG TGATCCTGAG TCACGGTTAT CTGTCTGATA GGACGGTATG TATAAAACGC 2400 

TTCTGTGAAC TGAGAGAAGG GGGAGAAGTT CTTGCTAAAA TTTAGTTGAA CAGCCGTATT 2460

CCGATACTTA GATAAGAGAT CTAGTCTTAG CTCCTACTCA GTTTTAGGGG ATAAAAAAGG 2520 

GGCAATAGCG ATTCGAGAAA GATTATACTC TTCGAAAATC TCTTCAAATC ACGTCAATAT 2580 

CGCCTTGTCG TATGTGTAGG ATACTGACTA CGTCAGTTCC ATCTACAACC TCAAAACAGT 2640 

GTTTTGAGCA ACcTGCGGCT AGTTTCCTAG TTTGATCTTT GATTTTCATT GAGTATTAGT 2700 

AATTCAGTTA CTAACTCGTC AACTCTGATT TATCCAATAA AATTGAAAAG GATGGAAAAA 2760 

AGGATAAATT TATGATATAC TTTATTTTGA AGACCTTATT AGAAATCTTG AAAGAGTATT 2820 

GAAAACTTAG AATGAGAAAA ATTGTTATCA ATGGTGGATT ACCACTGCAA GGTGAAATCA 2880

CTATTAGTGG TGCTAAAAAT AGTGTCGTTG CCTTAATTCC AGCTATTATC TTGGCTGATG 2940 

ATGTGGTGAC TTTGGATTGC GTTCCAGATA TTTCGGATGT AGCCAGTCTT GTCGAAATCA 3000 

TGGAATTGAT GGGAGCTACT GTTAAGCGTT ATGACGATGT ATTGGAGATT GACCCAAGAG 3060

GTGTTCAAAA TATTCCAATG CCTTATGGTA AAMTAACAG TCTTCGTGCA TCTTACTATT 3120 

TTTATGGGAG CCTCTTAGGC CGTTTTGGTG AAGCGACAGT TGGTCTACCG GGAGGATGTG 3180 

ATCTTGGTCC TCGTCCGATT GACTTACACC TTAAGGCGTT TGAAGCTATG GGTGCCACTG 3240 

CTAGCTACGA GGGAGATAAC ATGAAGTTAT CTGCTAAAGA TACAGGACTT CATGGTGCAA 3300

GTATTTACAT GGATACGGTT AGTGTGGGAG CAACGATTAA TACGATGATT GCTGCGGTTA 3360 

AAGCAAATGG TCGTACTATT ATTGAAAATG CAGCCCGTGA ACCTGAGATT ATTGATGTAG 3420

CTACTCTCTT GAATAATATG GGTGCCCATA TCCGTGGGGC AGGAACTAAT ATCATCATTA 3480

TTGATGGTGT TGAAAGATTA CATGGGACAC GTCATCAGGT GATTCCAGAC CGCATTGAAG 3540

CTGGAACATA TATATCTTTA GCTGCTGCAG TTGGTAAAGG AATTCGTATA AATAATGTTC 3600

TTTACGAACA CCTGGAAGGG TTTATTGCTA AGTTGGAAGA AATGGGAGTG AGAATGACTG 3660

TATCTGAAGA CAGCATTTTT GTCGAGGAAC AGTCTAATTT GAAAGCAATC AATATTAAGA 3720

CAGCTCCTTA CCCAGGCTTT GCAACTGATT TGCAACAACC GCTTACCCCT CTTTTACTAA 3780

GAGCGAATGG TCGTGGTACA ATTGTCGATA CGATTTACGA AAAACGTGTA AATCATGTTT 3840

TTGAACTAGC AAAGATGGAT GCGGATATTT CGACAACAAA TGGTCATATT TTGTACACGG 3900

GTGGACGTGA TTTACGTGGG GCCAGTGTTA AAGCGACCGA CTTAAGAGCT GGGGCTGCAC 3960

TAGTCATTGC TGGGCTTATG GCTGAAGGTA AAACTGAAAT TACCAATATC GAGTTTATCT 4020

TACGTGGTTA TTCTGATATT ATCGAAAAAT TACGTAATTT AGGAGCGGAT ATTAGACTTG 4080

TTGAGGATTA AACCGTAGAG GTGTTTATGA ATATTTGGAC CAAATTAGCA ATGTTTTCTT 4140

TTTTTGAAAC GGATCGCTTG TATTTGCGTC CTTTCTTTTT TAGTGATAGT CAGGACTTCC 4200

GCGAGATAGC TTCAAATCCA GAAAATCTTC AATTTATTTT CCCAACGCAG GCAAGTCTGG 4260

AAGAAAGTCA ATATGCACTG GCCAATTACT TTATGAAGTC CCCTTTGGGA GTGTGGGCAA 4320

TTTGTGACCA GAAAAATCAA CAAATGATTG GTTCTATTAA ATTTGAGAAG TTAGATGAAA 4380

TCAAAAAAGA AGCTGAGCTT GGCTATTTTT TGAGAAAAGA TGCTTGGTCG CAAGGATTTA 4440

TGACAGAGGT TGTTAGAAAA ATTTGTCAGC TTTCTTTTGA GGAATTTGGC TTAAAACAAT 4500

TATTTATCAT TACCCACCTT GAAAATAAAG CTAOCCAAAG AGTTGCTCTT AAGTCTGGAT 4560

TTAGTTTGTT CCGTCAGTTT AAGGGAAGTG ATCGTTACAC AAGAAAAATG CGGGATTATC 4620

TTGAATTTCG GTATGTAAAA GGAGAGTTCA ATGAGTAAGC ATCAGGAAAT TCTAAGCTAT 4680

TTGGAGGAAT TACCAGTAGG TAAAAGGGTC AGTGTTCGTA GCATTTCGAA TCATCTAGGA 4740

GTTAGTGATG GAACAGCCTA TCGGGCTATT AAAGAAGCTG AAAACCGTGG AATTGTGGAG 4800

ACCCGTCCTA GAAGTGGAAC AATTCGTGTT AAATCCCAGA AAGTTGCTAT AGAGAGATTA 4860

ACGTTTGCTG AAATTGCAGA AGTGACTTCT TCTGAGGTTC TGGCTGGGCA AGAAGGTTTA 4920

GAGAGAGAAT TTAGTAAGTT TTCAATTGGT GCCATGACTG AACAAAATAT CTTGTCTTAC 4980

CTTCATGATG GATTGTCGGA GACCGAACCC GTATTCAGTT GCTAGCCTTG 5040

GAAAATGAAA ATGCAGTTCT GGTTACAGGG GGATTTCAGG TTCATGATGA TGTGCTTAAA 5100

CTGGCCAATC AAAAAGGGAT TCCTGTTCTA AGAAGTAAGC ATGATACCTT TACCGTCGCG 5160

ACCATGATCA ATAAAGCCTT GTCAAATGTC CAAATCAAGA CTGATATI'CT GACAGTTGAG 5220

AAACTTTATC GCCCTAGTCA TGAGTATGGT TTTCTGAGAG AGACAGATAC AGTTAAAGAT 5280 

TATTTGGACT TGGTTCGTAA GAATCGTAGC AGCCGTTTCC CTGTTATCAA TCAACATCAG 5340 

GTCGTTGTTG GTGTTGTAAC CATGAGAGAC GCTGGTGATA AATCACCAAG CACGACAATT 5400 

GATAAGGTTA TGTCTCGTAG TCTATTTTTG GTTGGATTAT CGACAAATAT TGCCAATGTG 5460 

AGTCAACGGA TGATCGCAGA AGACTTTGAA ATGGTACCAG TTGTTCGAAG CAATCAAACT 5520 

TTGCTTGGCG TTGTGACGCG ACGAGATGTC ATGGAGAAGA TGAGCCGTTC CCAAGTTTCG 5580 

GCTCTACCAA CTTTTTCTGA GCAGATTGGA CAAAAC5CTCT CTTATCACCA TGATGAAGTA 5640 

GTCATTACAG TGGAACCCTT TATGCTAGAA AAAAATGGAG TTTTGGCTAA TGGTGTATTG 5700 

GCAGAAATTC TGACCCACAT GACCCGATTT AGTTGTTAAT AGTGGTCGCA ATCTCATTAT 5760 

CGAGCAGATG CTGATCTACT TTTTGCAGGC TGTTCAGATA GATGATATAT TGCGCATTCA 5820 

GGCACGGATT ATTCATCATA CGAGACGGTC AGCTATAATT GATTACGATA TTTATCATGG 5880 

TCACCAGATT GTTTCAAAAG CAAATGTGAC TGTTAAAATT AATTAGAAAC TAGGAGAAAA 5940 

GATGATAACA TTAAAATCAG CTCGTGAAAT CGAAGCTATG GACAAGGCTG GTGATTTTCT 6000 

AGCAAGTATT CATATAGGCT TACGTGATTT GATTAAGCCA GGCGTAGATA TGTGGGAAGT 6060 

TGAAGAATAT GTCCGCCGTC GTTGTAAAGA AGAAAATTTC CTTCCACTTC AGATTGGGGT 6120

TGACGGTGCC ATGATGGACT ATCCTTATGC TACCTGTTGC TCTCTTAACG ATGAAGTGGC 6180 

TCACGCTTTC CCTCGTCATT ATATCTTGAA AGATGGTGAT TTGCTCAAAG TTGATATGGT 6240 

TTTGGGAGGT CCCATTGCTA AATCTGACCT AAATGTCTCA AAATTAAACT TCAACAATGT 6300 

TGAACAAATG AAAAAATACA CTCAGAGCTA TTCTGGTGGT TTAGCAGACT CATGTTGGGC 6360 

TTATGCTGTT GGTACACCGT CCGAAGAAGT CAAAAACTTG ATGGATGTAA CCAAAGAAGC 6420 

TATGTACAAG GGTATTGAGC AAGCTGTTGT TGGAAATCGT ATCGGTGATA TCGGTGCGGC 6480 

TATTCAAGAA TACGCTGAAA GTCGTGGTTA CGGTGTAGTG CGTGATTTGG TTGGTCATGG 6540 

TGTTGGCCCA ACTATGCACG AAGAACCAAT GGTTCCTAAC TATGGTATTG CAGGTCGTGG 6600 

ACTCCGTCTT CGTGAAGGAA TGGTCTTAAC CATTGAACCA ATGATCAATA CAGGCGATTG 6660 

GGAAATTGAT ACAGATATGA AAACTGGTTG GGCGCATAAG ACCATTGACG GTGGATTGTC 6720

ATGTCAGTAT GAACACCAAT TTGTCATTAC GAAAGATGGA CCTGTTATCT TGACTAGCCA 6780 

AGGTGAAGAA GGAACTTATT AATAAAAAGT GAAAAGACTA CTGGAAGTTT ATTTTGATAA 6840

AAAATCCAGT AGATCTTTTC ATAATAAAAC GCATTGTATC AAGTGTTAGG GGCTGATATC 6900 

ATGCGTTTTT CTGCTTTTAA GATTTTTTCC AACTCTGTTT GTAAGCGCAT CATAACAAAG 6960

GGTCTAGGAT TCAGGGCTCT CCTCCTATAT ACTATTAGTA AAGTAAAACT AAGGGAGGAT 7020

ATTTTAGTGT CGCAGTCTAT TGTTCCTGTA GAGATTCCAC AATATTGTCG TTTTGATTCT 7080 

AAAAAGAGAA ATGGAATTCT GTTTAATGTT CGTATTGCCA ATCTTAAATT TACTTTTTTA 7140 

TATTATACTT CCTGCGAAAC AAAATATGGT ATAGTAGTTC TATGAATGAT GAAGCAAGTA 7200 

AACAACTAAC TGATGCACGA TTTAAGCGTC TTGTTGGTGT TCAGCGTACC ACTTTTGAAG 7260 

AGATGTTAGC TGTATTAAAA ACAGCTTATC AACTTAAACA CGCAAAAGGT GGACGAAAAC 7320 

CTAAATTAAG CCTAGAAGAC CTTCTTATGC CCACTCTTCA ATAGTGCGAG AATATCGAAC 7380 

TTATGAAGAA ATTGCGGCTG ATTTTGGTAT TCACGAAAGC AACTTTATCC GTCGGAGCCA 7440 

ATGGGTTGAA ATAACTCTTG TTCAAAGTGG TTTTACGGTT TCAAGAACTC CTCTCAGTTC 7500 

TGAGGACACO GTAATGATTG ATGCGACGGA AGTAAAAATC AATCGCCCTA AAAAAACAAT 7560 

TAGCGAATGA TTCTGGTAAA AAGAAATTTC ACGCTATGAA OGCTCAAGCG ATTGTCACAA 7620 

GTCAAGGGAG AATTGTTTCT TTGGATATCG CTGTGAACTA TAGTCATGAT ATGAAGTTGT 7680 

TCAAAATGAG TCGTAGAAAT ATCGAACAAG CTGGTAAAAT CTTGGCTGAC AGTGGl'TATC 7740 

AAGGGCTCAT GAAGATATAT CCTCAAGCAC AAACTCCACG TAAATCCAGC AAACTCAAGC 7800 

CGCTAACAGC TGAAGATAAA GCCTATAACC ATGCGCTATC TAAGGAAAGA AGCAAGGTTG 7860 

AGAACATCTT TGCCAAAGTA AAAACGTTTA AAATATTTTC AACAACCTAT CGAAATCATC 7920 

GTAAACGCTT CGGATTACGA ATGAATTTGA GTGCTGGTAT TATCAATCAT GAACTAGGAT 7980 

TCTAGTTTTG CAGGAAGTCT ATTGAGGTAT TGAGCTAGTT TATGAAAAAA TTGGGTGAAA 8040

AGTCGAGTGT TTTAGAAACC CACAGTGTAG TATTCTAGTT TCAATCCACT ATATTTTGCT 8100 

ACTCCCCGTA AAGTTTCTAT TTTCCCTGAT TTCTGATATA ATAGAAATAT TGACTTCAAG 8160 

AGTAAGGAAG AGAAGATGAA CGCATTATTA AAOX^GAATGA ATGACCGTCA GGCTGAGGCX3 8220 

GTGCAAACGA CAGAA06TCC CTTGCTAATC ATGGCAGGGG CTGGTTCTGG AAAGACTCGT 8280 

GTTTTGACCC ACCGTATCGC TTATTTGATT GATGAAAAGC TGGTCAATCC TTGGAATATC 8340 

TTGGCCATTA CCTTTACCAA CAAGGCTGCG CGTGAGATGA AAGAGCGTGC TTATAGCCTC 8400 

AATCCAGCGA CTCAGGACTG TCTGATTGCG ACCTTCCACT CCATGTGTGT GCGTATTTTG 8460 

CGTCGCGATG CGGACCATAT TGGCTACAAT CGTAATTTTA CAATTGTGGA TCCTGGTGAA 8520 

CAGCGAACGC TCATGAAACG TATTCTCAAA CAGTTGAACT TGGACCCTAA AAAATGGAAT 8580 

GAACGAACTA TTTTGGGGAC CATTTCCAAT GCTAAGAATG ATTTGATTGA TGATGTTGCT 8640 

TATGCTGCCC AAGCTGGCGA TATGTATACG CAAATTGTGG CCCAGTGTTA TACAGCCTAT 8700

CAAAAAGAAC TTCGTCAGTC TGAATCCGTT GACTTTGATG ATTTGATTAT GCTGACCTTG 8760

CGTCTCTTTG ATCAAAATCC TGATGTTTTG ACCTACTACC AGCAAAAATT CCAATACATC 8820 

CACGTTGATG AGTACCAAGA TACCAACCAC GCTCAGTACC AATTGGTCAA ACTCTTGGCT 8880 

TCCCGTTTTA AAAATATCTG TGTGGTTGGG GATGCGGACC AGTCTATCTA CGGTTGGCGT 8940 

GGTGCTGATA TGCAGAATAT CTTGGACTTT GAAAAGGATT ACCCCAAAGC CAAGGTTGTT 9000 

TTGTTGGAGG AAAATTACCG CTCAACCAAA ACCATTCTCC AAGCGGCCAA CGAGGTTATT 9060 

AAAAATAATA AAAATCGCCG TCCTAAAAAT CTCTGGACTC AAAACGCTGA TGGGGAGCAA 9120 

ATCGTTTACT ATCGTGCCGA TGATGAGCTG GATGAGGCTG TATTTGTAGC CAGAACCATC 9180

GATGAACTTA GTCGCAGTCA AAACTTCCTT CATAAGGATT TTGCAGTTCT CTATCGGACT 9240

AATGCCCAGT CCCGTACAAT TGAGGAAGCC CTGCTCAAGT CTAACATTCC TTATACCATG 9300 

GTTGGCGGAA CCAAATTCTA CAGCCGTAAG GAAATTCGCG ATATTATTGC TTATCTCAAC 9360 

CTTATTGCTA ATTTGAGTGA CAATATTAGT TTTGAGCGTA TTATCAACGA GCCTAAACGT 9420 

GGAATTGGTC TAGGTACAGT TGAGAAAATC CGTGATTTTG CAAATTTGCA AAATATGTCT 9480 

ATGCTGGATG CTTCTGCTAA TATTATGTTG TCTGGTATCA AGGGTAAGGC AGCCCAATCT 9540 

ATCTGGGATT TTGCCAATAT GATGCTTGAT TTGCGGGAGC AGCTAGACCA CTTAAGCATT 9600 

ACAGAGTTGG TTGAGTCCGT CCTAGAAAAA ACAGGTTATG TCGATATTCT TAACTCCCAA 9660 

GCGACTCTAG AAAGCAAGGC ACGGGTTGAA AATATCGAAG AGTTTCTTTC TGTTACGAAG 9720 

AACTTTGATG ACACCACGGA TGTGACAGAA GAGGAAACTG GTCTGGACAA ACTGAGTCGT 9780

TTCTTAAATG ACTTGGCTTT GATTGCCGAC ACAGATTCAG GTAGTCAGGA GACATCAGAA 9840

GTGACCTTGA TGACCCTGCA TGCTGCCAAA GGTCTCGAAT TTCCAGTTGT CTTTTTGATT 9900

GGGATGGAAG AAAATGTCTT TCCACTTAGT CGTGCGACTG AAGATTCAGA TGAATTAGAA 9960

GAAGAGCGCC GTCTAGCCTA TGTAGGTATC ACGCGTGCAG AGAAAATTCT CTATCTGACC 10020 

AATGCCAACT CACGCTTGCT TTTTGGTCGT ACCAATTATA ACCGTCCGAC TCGTTTTATT 10080 

AACGAAATCA GTTCAGACTT GCTTGAGTAT CAAGGTCTGG CTCGTCCTGC AAATACAAGC 10140 

TTTAAGGCAT CATATAGCAG TGGTAGTATT TCCTTTGGTC AAGGTATGAG TTTGGCTCAG 10200 

GCTCTTCAAG ACCGTAAACG CGGTGCTGCC CCAAAATCAA TCCAGTCAAG CGGTCTTCCA 10260

TTTGGTCAAT TTACAGCTGG CGCAAAACCA GCATCTAGCG AGGCAAATTG GTCCATTGGT 10320

GATATTGCTC TCCACAAGAA ATGGGGAGAG GGAACCGTTC TGGAAGTTTC AGGTAGCGGT 10380

GCTAGGCAGG AATTGAAAAT CAATTTCCCA GAAGTAGGTT TGAAAAAACT TTTAGCCAGT 10440

GTGGCTCCAA TTGAGAAAAA AATCTAATTT TCCATCCTTC TCACGAATAA TAAAGTGAGG 10500

AGGATTTTTA TGTACAGTAT TTCATTCCAA GAAGATTCAC TATTACCAAG AGAAAGGCTG 10560

GCCAAGGAAG GAGTTGAAGC GCTTAGTAAC CAAGAGTTGC TAGCTATTTT ACTCAGGACA 10620 

GGAACACGTC AAGCTAGCGT TTTTGAAATT GCXTCAAAAAG TCTTGAACAA TCTTTCAAGC 10680 

CTAACGGATT TGAAAAAAAT GACCCTGCAG GAATTGCAGA GTTTGTCTGG TATTGGGCGT 10740 

GTTAAGGCCA TAGAATTACA AGCTATGATT GAACTGGGGC ATCGTATTCA CAAACACGAG 10800 

ACTCTTGAAA TGGAAAGTAT TCTCAGCAGT CAAAAGTTGG CCAAGAAGAT GCAGCAGGAA 10860 

TTAGGGGATA AAAAACAAGA GCACCTGGTG GCACTCTATC TCAATACTCA AAATCAAATC 10920 

ATCCATCAGC AGACCATTTT TATCGGGTCT GTAACTCGTA GTATCGCTGA ACCGCGAGAG 10980 

ATTCTTCACT ATGCAATCAA GCATATGGCG ACTTCTCTTA TCTTGGTCCA CAATCATCCT 11040 

TCAGGAGCGG TAGCGCCTAG CCAAAATGAT GATCATGTCA CTAAACTTGT TAAAGAAGCC 11100

TGCGAATTGA TGGGGATTGT TCTCTTGGAC CATTTGATTG TCTCTCATTC TAATTACTTT 11160 

AGTTATCX5TG AAAAGACAGA TTTAATCTAA AGTTCATTAA CGACATAGTC AAAGAGTTTT 11220 

TTATCTTTGG GACGATTTTC AAAAAGAAGT TCTGGATGCC ATTGGACACC GAGAAAGGCG 11280 

ACATCATCCG TACTCATGAC AGCCTCAATG ATACCATCTT TAGGATCATG AGCCACAACT 11340 

TTTAAATTTG GTGCTAAGTC CTTGATGCTC TGGTGGTGGA AGGAGTTGAT ATGAGAGATT 11400 

TCTCCATAGA TTTCTTGGAG AACGGTATCT GGTTCTGTTA CCAAGCGTTG AGTTGTGTAC 11460 

TCAACAGAAG AATCCTGCCA ATGGTCTTCG ATATCTTGGT ACAAAGTTCC ACCCATGGCA 11520 

ACGTTAAAGA GTTGGGTACC ACGGCAGACA GAGAAAATGG GCTTTTTCTG TTTAATAGCT 11580 

TCCTTGATGA GGGCCAGTTC GAAGATATCT CTTTGAAGGT GATAGTCATC ACTATCAATG 11640 

GTTTTGGGTT CGCCATAAAA TTTTGGATCG ACATTTTGCC CACCTGTCAA GATGAGCTTG 11700 

TCAATCAAAC TGATATAGTG GCAGGCCATT TCTTGATCAC CAATCGGTAG GATGATGGGA 11760 

ATCCCTCCAG CATCTTTAAC GCCTTCAACA AAGCCTTTTG CTGCGTAGCT CATCATGATG 11820

TCATCATCTG GATGAGTTTT TTCGTTTCCT GTAATCCCAA TAACTGGTTT TTTCATAAAA 11880 

TGATTTTCGC TTTCTAATCC TCTTTTCGCA TGAAGTAGAG GAGGGTTTGG AGTTCACTTG 11940 

TCAAATCGAC ATACTGAACG ACCACGTCTT TTGGTAAATG CAGATGGACT GGTGAAAAAC 12000 

TGAGAATTCC TTTCACACCA GCATCAACCA AGAGATTAGC AACCTCTTGT GACTTGACGC 12060 

TGGGAACAGT TAGGATAGCA GTCTTCACAT CAGCATCCTT GATTTTATCC TTGATCTGAG 12120 

AAATCCCGTA AATGGGAATC CCGTCAGGAG TTTGGGTACC GACTTCAGGA TGGTCGTCTA 12180

GGTCAAAGGC CATGATAATC TTCATCTTGT TACGTTCGTG GAAGCGGTAG TGGAGAAGGG 12240

CATGGCCCAT ATTTCCAATA CCAACCAGCA TGACATTGGT AATAGAGTTG TCATTGAGCA 12300

AATCGGCAAA AAATGTCATT AGTTTTTTGA CATCATAGCC AAAACCACGA CGACCAAGTT 12360 

CACCAAAATA GGAAAAATCA CGACGTACGG TCGCTGAATC AATACCGATA GCCTCTGCAA 12420 

TTTGCTTAGA GTTGGCACGT TCAATCTTTT CTGCATGAAA TCTCTTAAAA ATTCGATAGT 12480 

AGAGAGAGAG TCTTTTTGCT GTAGCTTTTG GAATAGCAAA CTGTTTATCT TTCACAAAAT 12540 

CACAACCTTT CTATTCTTCT ATTTTATAGA AACATTGTGA AAAAATCAAC AAAAATAAGA 12600 

AAAAACTAAG AAAAATCTTA GTTTTGATGT AAAAAATCTG CATGAGATAG AAAACGGTAG 12660

AGGTCTCCGA CCAGCCCCTG ATAAACTTTT TTGCCXTCTAA AAGTCAGAGA AGTCACATAA 12720 

AGTGTATCTG GTAAGGTTAC ACATCCTGAC AAAGTCAACA TGAGAGCCTC ATGATCCTCA 12780

TACTTGAGAG TACGCTCTAC ATGATAGCAG TCCTTATAGG TCAGTTCAAA CATTTTGGCT 12840 

CTATCTTTCC GATTTTGTAA AGACACCACG TTCTACCAAG CTATCCATGA GGAAGTAGAA 12900 

TTTTTCCTGA TGAATATGGT GGTCTTCTGA TTTGAAAATA TCAACTAGAC GAAGGCCAAA 12960 

CTTGTCAGTG ATATTGATTT TAGCCCCTGT AAGTTCCTTG TTAATGATGA TTTTGAGTTG 13020 

GAAGCCTTCA CCGCTGTTTG GCACTTTTTC CAAAAGGCGA GTCAGTTCAT AGTTACCAAC 13080 

CTTAGTTTCA AAAAAGGTGT TATCTTTGAG GGTGAATTTT TTAACAGAAG GGCTAAGAGT 13140 

GTAATCGTAA CGACAATTTT TTAACTGAAT GATTTTTTCA AATGCCATAT GGCTAACCTC 13200 

CGATAATTTC TTTTAAGGTT TTTGCGAGGG TTTGTAGGTC TTCAACGGTA TTTTGTGGCG 13260

ACAAACTGAT GCGAAGGGAT TCCTTCAAGC GTTCTGAATT TGCGCCATAC ATGGCTTCAA 13320 

GAACATGGCT GGATTGGACA ACGCCTGCAG TACAGGCTGA GCCAGTAGAG ATTGAAATTC 13380

CAGCTAAATC TAGCCGAAGG AGTAAGAGGT CATTTTTCTG ACCAGGAAAT CCAATATTGA 13440

GAACATAAGG GAGATGATGT TTTCCTCTAT TCAGGTAATA CTGAATGCCC TCCAGCTCTG 13500

CCAGAAAGGC AGTTTCTAGA TTTTGTACAT GTTGAAAATG TTCTTCTTGT TTTTCTAGGT 13560 

CTTCTTTTAG GGCTGCAACC ATGCCTACAA TGGCAGGCAG ATTTTCAGTT CCTGCACGTT 13620 

TTTTCTGTTC CTGGTCTCCG CCATGTAGAT AGGAATCAAA GTCCATGCTA GATGCGTAGA 13680 

GAAAACCGAT TCCCTTAGGA CCATGGAATT TGTGGGCAGA AGCAGTGAGA AAATCAATGC 13740 

CCAATTCTTC TGAATGAATT GGGATTTTAC CAATAGCXTTG AACTGCATCA ACATGATAGG 13800

CAGCAGGGTG TTGCTTGAGT ATTTGGCCAA TTTCAGCGAT GGGCAGTAGG TTTCCTGTCT 13860 

CATTATTGAC AAACATGGTA GAAACCAAAA TCGTATCGTC ACGTAAAGCC TTTTGAATTT 13920 

GCTGGGCTGT GATTTCTTGA TTTTCTGGCT CGATAATGGT TGCTTCAAAC CCAAAGTGTT 13980 

GAACCAAGTA ATCAATTGTT TCAAGGACAG CATGGTGCTC GATGGCAGTT GTGATGATAT 14040

GTTTTCCTTG TTCTTGGTGA CGAAGACAGT AGCCAATGAT GGTAGTATTA TTGCCTTCAG 14100

TCCCACCAGA AGTGAAAAAG ATATGTTGAG GTTTTGTCCT TAGTAACTGG GCTAGTTCCT 14160

C5ACGGGCTTC TCGCAAGAGT TTGCCAGCTT GACGACCATG ACCATGAATA CTAGAAGGAT 14220

TTCCCTGGGT TTCTTGCATA ACCTTGGTCA TAGCTGAAAT AGCAACTGCT GACATAGGAG 14280

TCGTTGCAGC ATTGTCCAAA TAAATCAAAG AATCACCTTA TTTCTTTTTA TTGTAGGCAA 14340

AGAGTGGGCT GACTGGTTTT CTTTCGTGAA TACGGACGAT AGCATCACCA ATTAACTCAC 14400

TAGCAGTGAT GTAGCATACA TTTTTAGGAG TTTTTTCTTT TGTTGCTACT GAATCAGTCA 14460

CAAGAATTTC TTTAATATTA GTATTGTCAA GAAGCTCAGC AGCTCCCTCG ACGAAGAGAC 14520

CGTGGCTAGA AACAGCATAA ATTTCTGTAG CTCCTTCACG TTCAACGATT TTAGAAGCTT 14580

CAGAGAAGGT ACGTCCTGTA TTTAAAATAT CATCAATCAA GATAGCTTTC TTACCTTCAA 14640

CATCACCAAT AATATAACCT TCGTTACGAG TTGCATCGTC TTGAGGGTAG TCGATAATGG 14700

CGATAGGAGC ATCAAGATAT TCAGCCAGGC TACGCGCACG TTTGACACCT GAATTTTTAG 14760

GGCTAACGAC AACAACATCT GAACCAAGCA ATCCTTTATC GCAGTAATGT TTTGCGAATA 14820

GGGGAACAGT GAAAAGATTA TCCACTGGAA TATCAAAGAA ACCTTGAACC TGAACGGCAT 14880

GCAAATCAAG AGTCAGGATA CGATCAACTC CAGCCTTAAC CAGCATATTG GCAACTAGTT 14940

TTGCTGTAAG TGGCTCACGA GGACAAGCAA TGCGGTCTTG ACGTGCATAG CCAAAATATG 15000

GAAGGACAAC GTTGATACTG TGGGCACTTG CACGCACACA AGCATCGACC ATGATTAACA 15060

ATTCCATTAG GTGGTTGTTG ACAGGGAAAC TTGTTGATTG GATGATGTAA ACATCATAAC 15120

CACX3GACACT TTCTTCGATA TTTACTTGGA TTTCTCCGTC TGAAAATTGA CGTGATGATA 15180

GTTTTCCAAG TGGGACACCA ACAGCTTGGG CAATTTTTTG TGCAATCTCT TGGTTAGAGT 15240

TGAGTGCGAA AAGTTTCATG TTTTTTCTAT CTGACATTAT AGACCGTCCT CTGTAAACTT 15300

TATAAATCCT AGTTATATTT ACCTTACATA TATGAACTGG GATTTGTGTA TTTTTATCTT 15360

TTCTATTTTA CCAAAAAATG GAGATTATTT CAGCTATTTT TCATACTTTT GACAAATCGA 15420

ACCAATTTTG AAGGAGCTTT TTGATAGGAA ATCTGATTTT TCTCTAAAAA TTGTCGAAAA 15480

TCCTGTTTGC CTTGCTCATG ATTTTCCACT TCAAGCTCCA ATTCGTAATC TGTTATATCA 15540

AAGTATCGGC TCTGATCCAG TGCCATGAGA CCAATAGCTG TTTTCATTTC ATAGCGAAGC 15600

GTTGTTAGAC AACCAAGAAC CTGCCAGTTC TTACTTTGGA TACCATGTTT CGCCAATTCA 15660

TCXa^GTACTA GCCCTTGAGG AAGTTCTTCC TTACTCAGAT AGTTCTCAGC ATCTTTTAGT 15720

TGCAATTTTT GGTTGTATTC CATGTTTCXIA ACACTCTGCG GGACTTTGAG TGTCAACTCA 15780

GCCCAGTCTT CAAAGGTTCG AATGCGCATA GCGACTTTCT TTTCTCGCAG TTCAAAATCA 15840

GGCGTGTCGA TGTAGTAATT TGTTTGAAGA ACAGGAGTGA CACCTGTGAA CTGGTCTTTT 15900 

AGACGATTGT ATTCATCTTT TTTCAATAGT GTTTTCAATT CAATTTCTAA ATGTTTCATT 15960 

TTTCTTACCT TTTTTTATCG TTGAAAGCGG ATTTATGGTA TAATAAGCAT TGTATTTATT 16020 

GTATATGAAT CTGGAGAAAA AATCAAAGAT ATTTTTGACG GATAATATGA GAACAAGGGA 16080 

GAATATATGA CCTTAGAATG GGAAGAATTT CTAGATCCTT ACATTCAAGC TGTTGGTGAG 16140 

TTAAAGATTA AACTTCGTGG TATTCGTAAG CAATATCGTA AGCAAAATAA GCATTCTCCA 16200 

ATTGAGTTTG TGACX:GGTCG AGTCAAGCCA ATTGAGAGCA TCAAAGAAAA AATGGCTCGT 16260 

CGTGGCATTA CTTATGCGAC CTTGGAACAC GATTTGCAGG ATATTGCPGG CTTACGTGTG 16320

ATGGTTCAGT TTGTAGATGA CGTCAAGGAA GTAGTGGATA TTTTGCACAA GCGTCAGGAT 16380 

ATGCGAATCA TACAGGAGCG AGATTACATT ACTCATAGAA AAGCATCAGG CTATCGTTCC 16440 

TATCATGTGG TAGTAGAATA TACGGTTGAT ACCATCAATG GAGCTAAGAC TATTTTGGCA 16500 

GAAATTCAAA TTCGTACTTT GGCCATGAAT TTCTGGGCAA CGATAGAACA TTCTCTCAAC 16560 

TACAAGTACC AAGGGGATTT CCCAGATGAG ATTAAGAAGC GACTGGAAAT TACAGCTAGA 16620 

ATCGCCCATC AGTTGGATGA AGAAATGGGT GAAATTCGTG ATGATATCCA AGAAGCCCAG 16680 

GCACTTTTTG ATCCTTTGAG TAGAAAATTA AATGACGGTG TAGGAAACAG TGACGATACA 16740 

GATGAAGAAT ACAGGTAAAC GAATTGATCT GATAGCCAAT AGAAAACCGC AGAGTCAAAG 16800 

GGTTTTGTAT GAATTGCGAG ATCGTTTGAA GAGAAATCAG TTTATACTCA ATGATACCAA 16860 

TCCGGATATT GTCATTTCCA TTGGCGGGGA TGGTATGCTC TTGTCGGCCT TTCATAAGTA 16920 

CGAAAATCAG CTTGACAAGG TCCGCTTTAT CGGTCTTCAT ACTGGACATT TGGGCTTCTA 16980 

TACAGATTAT CGTGATTTTG AGTTGGACAA GCTAGTGACT AATTTGCAGC TAGATACTGG 17040 

GGCAAGGGTT TCTTACCCTG TTCTGAATGT GAAGGTCTTT CTTGAAAATG GTGAAGTTAA 17100 

GATTTTCAGA GCACTCAACG AAGCCAGCAT CCGCAGGTCT GATCGAACCA TGGTGGCAGA 17160 

TATTGTAATA AATGGTGTTC CCTTTGAACG TTTTCGTGGA GACGGGCTAA CAGTTTCGAC 17220 

ACCX3ACTGGT AGTACTGCCT ATAACAAGTC TCTTGGCGGT GCTGTTTTAC ACCCTACCAT 17280 

TGAAGCTTTG CAATTAACGG AAATTGCCAG CCTTAATAAT CGTGTCTATC GAACACTGGG 17340 

CTCTTCCATT ATTGTGCCTA AGAAGGATAA GATTGAACTT ATTCCAACAA GAAACGATTA 17400 

TCATACTATT TCGGTTGACA ATAGCGTTTA TTCTTTCCGT AATATTGAGC GTATTGAGTA 17460 

TCAAATCGAC CATCATAAGA TTCACTTTGT CGCGACTCCT AGCCATACCA GTTTCTGGAA 17520

CCGTGTTAAG GACGCCTTTA TCGGCGAGGT GGATGAATGA GGTTTGAATT TATCGCAGAT 17580

GAACATGTCA AGGTTAAGAC CTTCTTAAAA AAGCACGAGG TTTCTAAGGG ATTGCTGGCC 17640

AAGATTAAGT TTCXAGGTGG AGCTATTCTG GTCAATAATC AACCGCAAAA TGCAACGTAT 17700

CTATTGGACG TTGGAGACTA CGTTACCATT GACATTCCCG CTGAGAAAGG CTTTGAAACC 17760

TTGGAGGCTA TTGAGCTTCC ATTAGATATT CTCTATGAGG ATGACCACTT TCTAGTCTTG 17820

AATAAACCCT ATGGAGTGGC TTCTATTCCT AGTGTCAATC ACTCTAATAC CATTGCCAAT 17880

TTTATCAAGG GTTACTATGT CAAGCAAAAT TATGAAAATC AGCAGGTTCA CATTGTTACC 17940

AGACTAGATA GGGATACTTC TGGCTTGATG CTCTTTGCCA AGCACGGTTA TGCCCATGCA 18000

CGATTAGACA AGCAGTTGCA GAAGAAATCT ATCGAGAAAC GCTACTTTGC TTTGGTTAAG 18060

GGAGATGGAC ATTTGGAGCC AGAAGGGGAA ATTATTGCTC CGATTGCGCG TGATGAAGAT 18120

TCCATTATTA CCAGACGAGT GGCTAAAGGC GGAAAGTATG CCCATACTTC ATACAAGATT 18180

GTAGCTTCTT ATGGAAATAT TCACTTGGTC TATATTCACC TGCACACTGG TCGAACCCAT 18240

CAAATCCGAG TCCATTTTTC TCATATCGGT TTTCCTTTGC TGGGAGATGA TTTGTATGGT 18300

GGTAGTCTGG AAGATGCSTAT TCAACGTCAG GCTCTGCATT GCCATTACCT ATCCTTTTAT 18360

CATCCATTTT TAGAGCAAGA CTTGCAGTTA GAAAGTCCCT TGCCGGATGA TTTTAGTAAC 18420

CTTATTACCC AGTTATCAAC TAATACTCTA TAAAAACTGT CTCAGAGTAT AATTATTATC 18480

TTAAAGGAGA AAACTCATGG AAGTTTTTGA AAGTCTCAAA GCCAACCTTG TTGGTAAAAA 18540

TGCTCGTATC GTTCTCCCTG AAGGGGAAGA GCCTCGTATT CTTCAAGCAA CAAAACGCTT 18600

AGTAAAAGAA ACAGAAGTGA TTCCTGTTTT GCTTGGAAAT CCTGAAAAAA TTAAAATTTA 18660

TCTTGAAATT GAAGGAATCA TGGATGGTTA TGAGGTCATC GACCCTCAAC ATTATCCTCA 18720

ATTTGAAGAA ATGGTTTCTG CCTTGGTGGA GCGTCGCAAG GGCAAAATGA CTGAAGAAGA 18780

TGTACGCAAG GTTTTGGTTG AAGATGTCAA CTACTTTGGT GTGATGTTGG TTTACTTGGG 18840

CTTGGTTGAT GGAATGGTGT CAGGAGCGAT TCACTCAACA GCTTCAACAG TTCGCCCAGC 18900

TCTACAAATC ATCAAAACTC GTCCAAATGT AACTCGTACT TCAGGAGCCT TCCTCATGGT 18960

TCGTGGTACG GAACGTTACC TATTTGGAGA CTGTGCCATT AACATCAATC CAGATGCAGA 19020

AGCCTTGGCT GAAATTGCCA TCAACTCAGC AATCACAGCT AAGATGTTTG GCATCGAACC 19080

TAAAATTGCC ATGTTGAGCT ATTCTACTAA AGGTTCAGGG TTTGGTGAAA GCGTTGATAA 19140

GGTCGTTGAA GCAACTAAAA TTGCTCACGA CTTGCGTCCT GACCTTGAAA TCGATGGTGA 19200

GTTGCAATTT GATGCAGCCT TTGTTCCTGA AACTGCAGCT CTGAAAGCTC CTGGAAGTAC 19260

GGTAGCTGGT CAAGCAAATG TCTTCATCTT CCCAGGTATC GAGGCAGGAA ATATTGGTTA 19320

CAAGATGGCT GAACXjCCTGG GTGGCTTTGC GGCTGTAGGA CCTGTTTTGC AAGGTTTAAA 19380

CAAGCCAGTT AATGATCTTT CTCX3TGGATG TAATGCAGAT GATGTTTACA AGTTGACCCT 19440 

CATCACAGCA GCTCAAGCAG TTCATCAATA GTGAAAACTA TAAAGTGATA TACTATGCTA 19500 

TACTGTAGTT ATGAAACTAT GTACGAAAAG CACTGCCATT AATTCCTGAG AACTAAATTA 19560 

CTGATTGGTG TCAAAAAGGA AAACTTCCAA GCGATGATAT CCTGTCTATA CACGACCTAT 19620 

AGAAATCTGT AATATACATA TCCGTAAAAC GATAAATTCC CTTTTTGATT TTAAATGAGT 19680 

ATGAAAAGAG AATTTTTTGG CTCTTTGTCA ACTGTAGTGG GTTGAAGAAA AGCTAAGCTC 19740

GAGAAAGGAC AAATTTCATC CTTTCTTTTT TGATATTCAG AGCGATAAAA ATCCGTTTTT 19800 

TGAAGTTTTC AAAGTTCCGA AAACCAAAGG CATTGCGCTT GATAAGTTTG ATGAGATTAT 19860 

TGGTCGCTTC CAGTTTGGCG TTAGAATAGT GTAGTTGA/^ GGCGTTGATA ATCTTTTCTT 19920

TATCrrTTGAG GAAGGTTTTA AAGACAGTCT GAAAAATAGG ATGAACCTGC TTAAGATTGT 19980 

CCTCAATAAG TCCGAAAAAT TTCTCTGGTT CCTTATTCTG GAAGTG7VAAA AGCAAGAGTT 20040 

GATAGAGCTG ATAGTGGTGT TTCAAGTCTT CCGAATAGCT CAAAAGCTTG TTTAAAATCT 20100 

CTTTATTGGT TAAGTGCATA CGAAAAATAG GACGATAAAA TCGCTTATCA CTCAGTTTAC 20160 

GGCTATCCTG TTGAATGAGT TTCCAGTAGC GCTTGATAG 20199 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ACCCGATGTA TCAGCGGATA TTTACTCTAT TTTTCAAACG ATGTTATACC CACAATAAAA 60 

GAAAAAAGAC CCTAAGGTCT CCTTTGCTTT TATTATTAAA CGCGTTCAAC TTTACCTGAT 120 

TTCAAAGCAC GAGCTGAAGC CCAAACTTTT TTAGGTTTAC CATCGATAAG AACAGTAACT 180 

TTTTGAAGGT TTGGTTTTAC GGCACGTTTT GTTTGGTTCA TCGCGTGTGA ACGGTTGTTT 240 

CCTGATACAG TCTTACGACC TGTAAAGTAA CATACTTTAG CCATTGTGTT TTCCTCCTAT 300

TAGATCTAAT ATAGCGGATG TGCTAGCACC ACATACCGTA CTATGTTATC ACATTTTCTT 360 

GTTTTTTGCA AGGGAATTGG AAGATTTTTT ATTTGTGTCT TAAATCAGGT CTTGCGTGAC 420 

ATTTCTGCTC TCCACATGCC ATCGTTGATT AACAGAACAC CAGAATTAAA ATTATGTGTA 480

TAAAAATCAT CTCTAACTGC AGCTAAGGGT ATAGCCGTCA AGTCCAAATC CCCACA6CTCA 540

TCTATCGATT TTCTTACAAC AATATCTGAA TCCAAATACA GTACACGAGA CTCGCTTACA 600

TACTTTGGAA TAAAATACCT AAAAAAGCCG CATATGAAAG TCCCTCAAAG GGGAGACX3AT 660

AACCTTTCAG AATATTACTG TCAATCTAAA CATTCACAAT CTCACTATTC AAAGTCTCTA 720

GTCTTTTTTC CATCAATTGG AACCATTCTC gcx;gaaggtc ATCATTAAAA ACATAAAACT 780

TAAGATTATA ATGATGAACA CAAAGAGATT TTATTGTTGT TTCAACTTTA TCCATATAAG 840

CATTATCTGC ACCTAAGACA ATCGCTTTTT TCTCTTCTTT CACTTTTTAT CTCATTTCTT 900

TTTATTCCCA TCATATTATT CCCATCATAT GTTTCCCATC ATATGTTTCT ACGTAACCAT 960

TATTTTCGCC TATTCGTTCG TAAAACCATA CCAGTGGAGA TTTTAGATGA AGTCCCATTA 1020

CGGTTTACAA TTTTTACATT ACGACACGGA GTTTTACAAA TCGATTTCAT TTGCCAAACG 1080

TAGTTAGTGA GGCAGTTAGC TAGTTCGCCA AATAOCGACT AGCGTCCAAC AATTTGGAAC 1140

TTTAGTTCCA ATTGTTGGTA CTGAGTCACA TCTTCTCCTC TAACTCTACG TCTGGATACT 1200

TGTCCGCAAA CCAGCGGAGG GCAAAGTCAT TTTCAAAGAG AAAGACTGGT TGGTCAAAAC 1260

GGTCTTTGGC TAAGATATTG CGACTTGACG ACATCCGTTC ATCCAAGTCC TCAGGCTTGA 1320

TCCAACGAAC GGTCTTTTTA CCCATTGGGT TCATAACTAC TTCCGCATTG TACTCGCCTT 1380

CCATGCGGTG TTTAAAGACT TCAAACTGGA GTTGACCTAC AGCGCCTAGC ATGTACTCAC 1440

CTGTTTGGTA ATTCTTATAA AGCTGAACGG CTCCTTCTTG CACCAATTGC TCAATCCCCT 1500

TGTGGAAGGA TTTTTGCTTC ATAACATTCT TAGCAGAAAC TTTCATGAAA ATCTCAGGTG 1560

TAAAGGTTGG CAGGGGTTCA AATTCAAACT TGTTTTTTCC AACCGTCAAG GTATCCCCAA 1620

CCTGATAAGT ACCGGTATCG TAAACCCCGA TAATATCACC TGCCACGGCA TTGGTCACAT 1680

TCTCACGACT CTCCGCCATA AACTGGGTAA CATTAGATAG TTTAGCCCCC TTACCAGTAC 1740

GAGGGAGATT GACACTCATG CCGCGCTCAA ATTCGCCAGA TACGATACGG ACAAAGGCAA 1800

TACGGTCACG GTGACGAGGG TCCATGTTGG CTTGGATTTT AAAGACAAAG CCTGAGAAAT 1860

CCTTGTCATA AGGATCCACA ATTTCACCGT CTGTTTTCTT GTGACCATGT GGTTCTGGAG 1920

CAAACTTGAG GAAGGTTTCA AGGAAGGTCT GCACACCAAA GTTTGTCAGG GCTGAACCGA 1980

AAAAGACAGG CGTCAATTCT CCAGCCAGAA TAGCTTCCTC TGAAAACTCA TTCCCGGCTT 2040

CATTTAAAAG CTCAATGTCA TCCTTGACTT GCTCGTAGAA AGGATTGCTA CCAAAGAGTT 2100

TGTCCCCGTC TTCTAGACTG GCAAAACGCT CATCCCCTTT GTAAAGCTCT AAACGTTGGT 2160

TATAGAGGTC ATACAAGCCC TCAAAGGCTT TCCCCATCCC GATAGGCCAG TTCATAGGGT 2220

AGCTAGCAAT GCCCAAGATT TCTTCCAATT CTTGCAAGAG ATCCAAAGGC TCACGACCGT 2280

CACGGTCCAG CTTGTTCATA AAGGTAAAGA CTGGAATGCC ACGATGTTTC ACAACCTCAA 2340


ACAATTTCTT GGTTTGAGCC TCGATCCCCT TGGCAGAGTC CACGACCATG ACCGCAGCAT 2400

CCACCGCCAT CAAGGTACGA TAGGTATCTT CTGAGAAGTC CTCGTGCCCT GGCGTGTCTA 2460

AGATATTCAC GCGCTTGCCG TCGTAGTCAA ATTGCATAAC AGATGAAGTA ACAGAAATCC 2520

CACGTTGCTT CTCGATATCC ATCCAGTCAG ATTTAGCAAA AGTCCCTGTT TTCTTCCCTT 2580

TTACCGTACC AGCCTCACGA ATCTCACCCC CAAAGTAGAG TAACTGCTCA GTGATGGTTG 2640

TTTTCCCCGC GTCCGGGTGG GAGATAATGG CAAAGGTACG ACGTTTCTTA ATTTCTTCTT 2700

GAATATTCAT AAGTTCTCTT TCTTTGATTC TCTATTTTTC TTGTTTCAAT AGCTGAGAAT 2760

GATTTTTACA TTGGATTTTA CCATTCCTTT CAACACTCCA TTATATCGGA TTTTAGCATT 2820

TTTTTCAATT TCTATTTCTT TTCACTTCCC CCTCCCTTAT TTATAGGAAA ATATGGTAAA 2880

ATAGAACAGA CTAAAAATCA TCATTTCACG AAAGGATGCA AGATGAAAAT TACGCAAGAA 2940

GAGGTAACAC ACGTTGCCAA TCTTTCAAAA TTAAGATTCT CTGAAGAAGA AACTGCTGCC 3000

TTTGCGACCA CCTTGTCTAA GATTGTTGAC ATGGTTGAAT TGCTGGGCGA AGTTGACACA 3060

ACTGGTGTCG CACCTACTAC GACTATGGCT GACCGCAAGA CTGTACTCCG CCCTGATGTG 3120

GCCGAAGAAG GAATAGACCG TGATCGCTTG TTTAAAAACG TACCTGAAAA AGACAACTAC 3180

TATATCAAGG TGCCAGCTAT CCTAGACAAT GGAGGAGATG CCTAATGACT TTTAACAATA 3240

AAACTATTGA AGAGTTGCAC AATCTCCTTG TCTCTAAGGA AATTTCTGCA ACAGAATTGA 3300

CCCAAGCAAC ACTTGAAAAT ATCAAGTCTC GTGAGGAAGC CCTCAATTCA TTTGTCACCA 3360

TCGCTGAGGA GCAAGCTCTT GTTCAAGCTA AAGCCATTGA TGAAGCTGGA ATTGATGCTG 3420

ACAATGTCCT TTCAGGAATT CCACTTGCTG TTAAGGATAA CATCTCTACA GACGGTATTC 3480

TCACAACTGC TGCCTCAAAA ATGCTCTACA ACTATGAGCC AATCTTTGAT GCGACAGCTG 3540

TTGCCAATGC AAAAACCAAG GGCATGATTG TCGTTGGAAA GACCAACATG GACGAATTTG 3600

CTATGGGTGG TTCAGGTGAA ACTTCACACT ACGGAGCAAC TAAAAACGCT TGGAACCACA 3660

GCAAGGTTCC TGGTGGGTCA TCAAGTGGTT CTGCCGCAGC TGTAGCCTCA GGACAAGTTC 3720

GCTTGTCACT TGGTTCTGAT ACTGGTGGTT CCATCCGCCA ACCTGCTGCC TTCAACGGAA 3780

TCGTTGGTCT CAAACTAACC TACGGAACAG TTTCACGTTT CGGTCTCATT GCCTTTGGTA 3840

GCTCATTAGA CCAGATTGGA CCTTTTGCTC CTACTGTTAA GGAAAATGCC CTCTTGCTCA 3900

ACGCTATTGC CAGCGAAGAT GCTAAAGACT CTACTTCTGC TCCTGTCCGC ATCGCCGACT 3960

TTACTTCAAA AATCGGCCAA GACATCAAGG GTATGAAAAT CGCTTTGCCT AAGGAATACC 4020

TAGGCGAAGG AATTGATCCA GAGOTTAAGG AAACAATCTT AAACGCGGCC AAACACTTTG 4080

AAAAATTGGG TGCTATCGTC GAAGAAGTCA GCCTTCCTCA CTCTAAATAC GGTGTTGCCG 4140


TTTATTACAT CATCGCTTCA TCAGAAGCTT CATCAAACTT GCAACGCTTC GACGGTATCC 4200

GTTACGGCTA TCGCGCAGAA GATGCAACCA ACCTTGATGA AATCTATGTA AACAGCCGAA 4260

GCCAAGGTTT TGGTGAAGAG GTAAAACGTC GTATCATGCT GGGTACTTTC AGTCTTTCAT 4320

CAGGTTACTA TGATGCCTAC TACAAAAAGG CTGGTCAAGT CCGTACCCTC ATCATTCAAG 4380

ATTTCGAAAA AGTCTTCGCG GATTACGATT TGATTTTGGG TCCAACTGCT CCAAGTGTTG 4440

CCTATGACTT GGATTCTCTC AACCATGACC CAGTTGCCAT GTACTTAGCC GACCTATTGA 4500

CCATACCTGT AAACTTGGCA GGACTGCCTG GAATTTCGAT TCCTGCTGGA TTCTCTCAAG 4560

GTCTACCTGT CGGACTCCAA TTQATTGGTC CCAAGTACTC TGAGGAAACC ATTTACCAAG 4620

CTGCTGCTGC TTTTGAAGCA ACAACAGACT ACCACAAACA ACAACCCGTG ATTTTTGGAG 4680

GTGACAACTA ATGAACTTTG AAACAGTCAT CGGACTTGAA GTCCACGTAG AGCTCAACAC 4740

CAATTCAAAA ATCTTCTCAC CTACTTCTGC CCACTTTGGA AATGACCAAA ATGCCAACAC 4800

TAACGTGATT GACTGGTCTT TCCCAGGAGT TCTACCAGTT CTCAATAAAG GGGTTGTTGA 4860

TGCCGGTATC AAGGCTGCTC TTGCCCTCAA CATGGACATC CACAAAAAGA TGCACTTTGA 4920

CCGCAAGAAC TACTTCTATC CTGATAACCC CAAAGCCTAC CAAATTTCTC AGTTTGATGA 4980

ACCAATCGGA TATAATGGCT GGATTGAAGT CAAACTAGAA GACGGTACGA CCAAGAAAAT 5040

CGGTATCGAA CGTGCCCACC TAGAGGAAGA CGCTGGTAAA AACACCCATG GTACAGATGG 5100

CTACTCTTAT GTTGACCTCA ACCGCCAAGG [..........] ATTGAGATTG TATCTGAGGC 5160

AGATATGCGT TCTCCTGAAG AAGCCTATGC TTATCTGACA GCCCTCAAGG AAGTTATCCA 5220

GTACGCTGGC ATTTCTGACG TTAAGATGGA GGAAGGTTCG ATGCGTGTGG ATGCCAACAT 5280

CTCCCTTCGT CCTTATGGTC AAGAGAAATT CGGTACCAAG ACTGAATTGA AGAACCTCAA 5340

CTCCTTCTCA AACX3TTCGTA AAGGTCTTGA ATACGAAGTC CAACGCCAGG CTGAAATTCT 5400

TCGCTCAGGT GGTCAAATCC GCCAAGAAAC ACGCCGTTAC GATGAAGCSA ATAAAGCAAC 5460

CATCCTCATG CGTGTCAAGG AAGGGGCTGC TGACTACCGC TACTTCCCAG AACCAGACCT 5520

ACCCCTCTTT GAAATTTCTG ACGAGTGGAT TGAGGAAATG CGGACTGAGT TGCCAGAGTT 5580

TCCAAAAGAA CGTCGTGCGC GTTATGTATC TGACCTTGGT TTATCAGACT ACGATGCTAG 5640

TCAGTTGACT GCTAATAAAG TCACTTCTGA CTTCTTTGAA AAAGCTGTTG CCCTAGGTGG 5700

TGATGCCAAA CAAGTCTCTA ACTGGCTCCA AGGGGAAGTC GCTCAGTTCT TGAATGCTGA 5760

AGGTAAAACA CTGGAACAAA TCGAATTGAC ACCAGAAAAC TTGGTTGAAA TGATTGCCAT 5820

CATCGAAGAC GGTACTATTT CATCTAAGAT TGCCAAGAAA GTCTTTGTCC ATCTAGCTAA 5880


AAATGGCGGT GGCGCGCGTG AATACGTGGA AAAAGCAGGT ATGGTTCAAA TTTCAGATCC 5940

AGCTATCTTG ATCCCAATCA TCCACCAAGT CTTTGCCGAT AACGAAGCTG CTGTTGCCGA 6000

CTTCAAGTCA GGCAAACGTA ACGCCGACAA GGCTTTACAG GATTCCTTAT GAAGGCAACC 6060

AAAGGCCAAG CCAACCCACA AGTTGCCCTT AAACTACTTG CACAGGAATT GGCGAAGTTG 6120

AAAGAAAACT AGACAGAACA AAACCAGCCC TAAGGTTGGT TTTTTCTTCT CTACCAACTC 6180

CCAATAACTA TTTTGGCTTT ATTTCCAGAG TATTTTATGG TAAAATGAAG AGTAATAATA 6240

TTTATTAAAG AGGTAAAAAC ATGATTGAAG CAAGTACCTT AAAAGCTGGT ATGACCTTTG 6300

AAACAGCTGA CGGCAAATTG ATTCGCGTTT TGGAAGCTAG TCACCACAAA CCAGGTAAAG 6360

GAAACACGAT CATGCGTATG A7ATTGCGTG ATGTCCGTAC TGGTTCTACA TTTGACACAA 6420

GCTACCGTCC AGAGGAAAAA TTTGAACAAG CTATTATCGA GACTGTCCCA GCTCAATACT 6480

TGTACAAAAT GGATGACACA GCATACTTCA TGAATACAGA AACTTATGAC CAATACGAAA 6540

TCCCTGTAGT CAATGTTGAA AACGAATTGC TTTACATCCT TGAAAACTCT GATGTGAAAA 6600

TCCAATTCTA CGGAACTGAA GTGATCGGTG TCACCGTTCC TACTACTGTT GAGTTGACAG 6660

TTGCTGAAAC TCAACCATCT ATCAAAGGTG CTACTGTTAC AGGTTCTGGT AAACCAGCAA 6720

CGATGGAAAC TGGACTTGTC GTAAACGTTC CAGACTTCAT CGAAGCAGGA CAAAAACTCG 6780

TTATCAACAC TGCAGAAGGA ACTTACGTTT CTCGTGCCTA ATCTCTAGAA AGAGGTCATT 6840

CTATGGGAAT TGAAGAACAA CTTGGCGAAA TCGTTATCGC CCCACGTGTA CTTGAAAAAA 6900

TCATTGCTAT CGCTACTGCA AAGGTAGAGG GTGTTCACTC TTTTTCAAAC AGATCAGTGT 6960

CTGATACCCT TTCAAAACTT TCACTCGGCC GTGGCATTTA TCTTAAAAAC GTGGACGAAG 7020

AACTCACAGC AGATATCTAT CTCTACCTTG AGTACGGAGT AAAAGTTCCT AAGGTAGCGG 7080

TTGCTATCCA GAAAGCTGTC AAAGATGCCG TCCGTAATAT GGCTGATGTA GAACTCGCTG 7140

CTATCAATAT TCACGTTGCA GGTATCGTCC CAGATAAAAC ACCAAAACCA GAATTGAAAG 7200

ATCTATTTGA CGAGGACTTC CTCAATGACT AGTCCACTAT TAGAATCTAG ACGCCAACTC 7260

CGTAAATGCG CTTTTCAAGC TCTCATGAGC CTTGAGTTCG GTACGGATGT CGAAACTGCT 7320

TGTCGTTTCG CCTATACTCA TGATCGTGAA GATACXSGATG TACAACTTCC AGCCTTTTTG 7380

ATAGACCTCG TTTCTGGTGT TCAAGCTAAA AAGGAAGAAC TAGATAAGCA AATCACTCAG 7440

CATTTAAAAG CAGGTTGGAC CATTGAACGC TTAACGCTCG TGGAGAGAAA CCTCCrTTCGC 7500

TTGGGAGTCT TTGAAATCAC TTCATTTGAC ACTCCTCAGC TGGTTGCTGT TAATGAAGCT 7560

ATCGAGCTTG CAAAGGACTT CTCCGATCAA AAATCTGCCC GTTTTATCAA TGGACTGCTC 7620

AGCCAGTTTG TAACAGAAGA ACAATAAGGC TCTTTGTCAA CTGTAGTGGG TTGAAAAAAA 7680

GCTAAGCTCG AGAAAGGACA AATTTCGTCC TTTCTTTTTT GATGTTCAAA GCGATAAAAA 7740 

TCCGTTTTTT GAAGTTTTCA AAGTTTCGAA AACCAAAGGC ATTGCGCTTG ATAAGTTTGA 7800 

TGAGATTATT GGTCGCTTCC AGTTTGGCAT TAGAATAGTG TAGTTGAAGG GCGTTGACAA 7860 

TCTTTTCTTT ATCTTTGAGG AAGGTTTTAA AGACAGTCTG AAAAATAGGA TGAGCCTGCT 7920 

TAAGATTGTC CTCAATAAGT CCGAAAAATT TCTCTGGTTC CTTATTCTGG AAGTGAAACA 7980 

GCAAGAGCTG ATAGAGCTGA TAGTGGTGTT TCAAGTCTTG TGAATGGCTC AAAAGCTTGT 8040 

CTAAAATCTC TTTATTGGTT AAGTGCATAC GAAAAGTAGG ACGATAAAAT CGCTTATCAC 8100 

TCAGTCTACG GCTATCCTGT TGAATGAGTT TCCAGTAGCG CTTGATATCC TTGTATTCAT 8160 

GGGATTTTCG ATGAAACTGA TTCATGATTT GGACACGCAC ACGACTCATG GCACGGCTAA 8220 

GATGTTGTAC AATGTGAAAG CGATCAAGAA CGATTTTAGC ATTCGGGAGT GAAACAGTCT 8280 

GGGAGACTGT TTCAGCCTGA GCCTAGGAAT TTGAAAGCGA AGCTGTTTAG CCAAGTCATA 8340 

GTAAGGGCTA AACATATCCA TAGTAATAAT TTTGACGCGA CATCGGACAA CTCTATCGTA 8400 

GCGAAGAAAG TGATTTCGAA TGATAGCTTG TGTTCTACCC TCAAGAACAG TGATGATATT 8460 

GAGATTGTTA AAATCTTGCG CAATGAAGCT CATCTTTCCC TTTGTAAAAG CATACTCATC 8520 

CCAAGACATA ATCTCAGGAA GACAAGAAAA ATCATGTTTA AAGTGAAAAT CATTGAGCTT 8580 

ACGAATAACA GTTGAAGTTG AGATGGAAAG CTGATGGGCA ATATCAGTCA TAGAAATCTT 8640 

TTCAATCAAC TTTTGAGCAA TCTTTTGGTT GATGATACGA GGGATTTGGT GATTTTTCTT 8700 

GACGATAGAA GTTTCAGCGA CCATCATTTT TGAACAGTGA TAGCACTTGA ATCGACGCTT 8760

TCTAAGGAGA ATTCTAGTAG GCATACCAGT CGTTTCAAGA TAAGGAATTT TAGAAGGTTT 8820 

TTGAAAGTCA TATTTCTTCA ATTGGTTTCC GCACTCAGGG CAAGATGGGG CGTCGTAGTC 8880 

CAGTTTGGCG ATGATTTCCT TGTGTGTATC CTTATTGATG ATGTCTAAAA TCTGGATATT 8940 

AGGGTCTTTA ATGTCTAGTA ATTTTGTGAT AAAATGTAAT TGTTCCATAT GAATCTTTCT 9000

AATGAGTTGT TTTGTCGCTT TTCATTATAG GTCATATGGG ACTTTTTTTC TACAATAAAA 9060 

TAGGCTCCAT AATATCTATA GGGGATTTAC CCACTACAAA TATTATAGAG CCAACAATAA 9120 

AAAGAAAAAG TGTTTGATAG ATATCAAACA CTTTTTTCTT TGCCTCCCAC TATCTAAAAA 9180 

AATGATAATA GATATAATTG TAAACAAAAA TCCAGATAGG TTTTGCATGA TTGAGAAAGT 9240 

TAAAAAAACT ATGGCAGAGA ATCGTTAATC TCAGATTGTC GGTAGAACGA TAAACAAGGG 9300

CAAAAAAGAA ACCAATCAGA CTATAATATA ATAAACTAAT TGGATCTCTG TGAGATAGTA 9360

TCAAATGGCT AATCCCAAAG ATGATAGCAG ATAGGATAAC ATCCAAATAG TACTTGGACT 9420

AGGGAAAGAA GGTATTCATA AAATACCCTC TATCAAGAGT CTCCTCAAAA ACAGGACCGA 9480 

TGATTACAGG CAGGACAAAA GATAAGATAG TCGATAAAAA GGTTGGTTGT CCATTTGAAA 9540

AAAGCACGGT AAAATACTCA TCATGAATAT TCCTATGATT AATCAAATGA GCATAGCGTG 9600

CCCAAAAATT ACCGAGAATC TGATAAACCA CATAAGTTGC AAATAAGTAG AAGACAAATG 9660 

ACCAGTTCCA GCTCTTTTTC TCAAAGATAA AGAGCATCTT TTTCTTTTTT AACCTCCAAA 9720 

TTAATAGAAG GAAACTTCCC ACTAATCCCA TTGTTAAAAT AAGAGAATAG ACATCAGCTC 9780 

CTAACCCTAA AATGATCGTC ACATACAATC CAATTGTTTG TGGTAAATAG GTAGATAGTA 9840 

AAATAATAAG CAAAAATATT CCAAATTGTC TTAGTTTTTT TGTGTTTCTC ATCGTACTTT 9900 

TTTGAAAGAT TACCCTGCTC GGAAGCCGTA CTTCCAAGCA TCTATATAAG AATTAAGTGC 9960 

CCCTTGCCTC ATATAGGGAG CAAATTCTCT ATAATATAAC CATCTACTAT ATCCATCTTC 10020 

CCAAACAGCA AGACCACCTG AAGTTTGCTC CAAGTCCTCA GTTGAAAGAA CTGTAAATGT 10080 

ATTTGTACCT GTCATTGCAA GTACCTTCTT AAAATAGATT GTTGTAGGCT CACATTTATA 10140 

GTATATTTCT TTTTTTGTCT ATTTTATAGC CCATCTCCTC AACTGGCAAT TTTTCGACCT 10200 

GAATTACATT TTTCCATAAA AAATGAGACC TTTCTAGTCT CATTTAGTCA TTCTTAGTAT 10260 

TTTCTAAATC GTTGATAGCG TTCTTCCAGC AACTCTTCTA GCGGTTTTTG TGAAAGTCTA 10320

GCCAGCTCCG TTTGGAGTTC TTTTTTGACA CTCTTAATCA GTTCTTTACT AGAAAGTCCT 10380 

ATTTCAGAAA TCACXTTTATC CACCACGTCC ATTTCTAACA GTTCATGCGA AGTGATTTTC 10440 

ATCAGTTCTG CTGCTTCCAT AGCGCGAGTA CCGTCCTTCC ATAAAATGGA AGCAAAGCCT 10500 

TCTGGACTGA GAATGGCATA GATAGAATTT TCCAGCATCC AGACACGGTC CGCGACAGCT 10560 

AGAGCCAGAG CCCCGCCTGA AOCACCTTCA CCGATAATAA TGGCQATAAT AGGAACTTTC 10620 

AGGTCACTCA TTTCCATGAG ATTGCGAGCG ATAGCTTCCC CTTGACCACG TTCTTCCGCr 10680 

CCGACACCAG GATAAGCACC TGCTGTATTG ATAAAGGTCA CAACTGGACG GCCAAATTTC 10740 

TCAGCCTGTT TCATCAACCG CAGTGCCTTT CGGTAGCCTT CTGGATGTGG TTGGCCAAAA 10800

TTCCGTTTGA GGTTGTCTTG CAAACTCTTG CCTTTTTGGA TACCAACCAC TGTTACAGCT 10860 

TGGTCTCCAA GCCAACCAAT ACCACCAACA ACTGCACCAT CATCACGAAA AGAACGGTCA 10920 

CCATGTAATT GGATAAATTC ATCAAAAATG CCTGTCGCAA AGTCCAAGGT TGTCAAGCGA 10980 

CTCTGCTCAC GCGCTTCTCT GACTATTTTT GCAATATTCA TCTAGGACTC CCTCCATGCA 11040 

ATCTGACTAG GCTAGCAATC GTATCTGGTA AGTCTCTTCT TTTGACAATA GCATCCACAA 11100 

AGCCATGTTC TAATAGGAAT TCTGCCTTTT GGAAATCCTC AGGCAAGCTT TCACGAACCG 11160 
TATTTTCAAT CACACGACGC CCAGCAAAAC CAACCAAGCT CTGTGGTTCA GCCAGAATGA 11220

TATCGCCTTC CATAGCGAAA GAAGCTGTCA CACCACCAGT CGTTGGATCT GTCAAAATGG 11280 

TCAGGTAAAA GAGACCAGCA TTTGAATGGC GTTTAACCGC CGCAGAGATC TTAGCCATCT 11340 

GCATGAGACT CATGATTCCT TCCTGCATAC GGGCTCCACC AGAGGCTGTG AATAGGACAA 11400 

CTGGCAATTT TTCGACAGTC GCATACTCAA ACAAACGAGT GATTTTTTCA CCTACAACCG 11460 

TACCCATAGA AGCCATGATA AAGTTAGAAT CCATAATCCC AAGAGCCACA GTCTGACCTT 11520 

TAATAAGAGC AGTTCCTGTC ACAACGGCTT CATGCAGACC TGTTTTTTCA CGCATAGATG 11580 

CCAGTTTCTT TTGGTAACCA GGGAAATGCA AGGGATCCTT GCTTTCAATC CCTGTAAACA 11640 

ATTCTTTGAA GGTTCCCATA TCAATCGTCA AAGCCAAGCG TTCTTGGGCA GAAATACGAA 11700 

AGGTATAGCT ACAGTGCGGA CAGATACGTT CACTTCCCAG ATCCTTCTGA TAGATGGTAT 11760 

GCTTACAGCC TGGACACTGG GAAAATAATT CATCTGGAAC CTCTGGCTTA GCTTGAGGTT 11820 

TTTCCCTAAC CGAACGATTG GGATTGATTC GAATATACTT ATCTTTTTTA CTAAATAGAG 11880 

CCATTGATTC CCCTTTTCGG TTTAAACTCT TAAAGTCATT TTATTCTTTT TCTTGATATT 11940 

TAGGTAAGAA GGTTTCCATC AAGAAGGAAG TATCATAATC CCCAGCAATG ACATTGCGAT 12000 

CTGAAATGAG GTCAAGCTGG AAATCTGCAT TGGTCTGCAC TCCTTCAATT TCTAATTCAT 12060 

AGAGGGCACG TTGCATTTTC ATCAAGGCGT CAAAACGATT TTCGCCGTGT ACTATGATTT 12120 

TGGC7VATCAT ACTATCATAA TAAGGCGGAA TGGTATAACC TGGATAAACT GCTGAATCCA 12180 

CGCGCAAGCC AACTCCACCA CTTGGCAGAT AGAGATTAGT AATCTTACCT GGACTTGGAG 12240 

CAAAGTTAAA GGCTGGGTTT TCTGCATTGA TACGACACTC GATGGCATGA CCGCGTAGGA 12300 

CAATATCTTC TTGCTTAACA GACAAAGGCT GACCTGCCGC AATGCAAATC TGTTCCTTAA 12360 

CGATATCAAC ACCTGAAACA AACTCTGTTA CTGGATGTTC TACCTGAACA CGAGTATTCA 12420 

TCTCCATGAA ATAGAAATTG CTACTTGCTT CATCAAGAAG AAATTCAATG GTTCCTGCAT 12480 

TCTCATAGCC AACAAACTCT GCCGCTCGAA CAGCAGCAGC ACCTATTTCA TGACGCAGCG 12540 

TTTTTCCGAT TGCAATCGAG GGACTTTCTT CCAAAACCTT TTGGTTATTC CTTTGAAGAG 12600 

AACAATCCCG TTCACCCAAG TGAATCACAT GTCCATGCTC ATCACCTAGG ATTTGAACCT 12660 

CAATGTGCCG AGCTGGATAG ATAACCCGTT CTATGTACAT GGCACCATTG CCATAATTGG 12720 

CCTTGGCCTC ACTAGAGGCA GTTTCAAAGG CAGAAACGAG GTCATCTGGT TTTTCAACCT 12780 

TACGAATCCC TTTACCACCT CCACCTGCTG AAGCCTTGAG CATAACAGGA TAGCCAATTT 12840 

TTTCAGCAAC AATCAAAGCT TCTTCAGAGT TATGCACTTC TCCATCTGAA CCTGGTATAA 12900 
CAGGCACACC TGCTTTAATC ATCTGAGCAC GCGCATTGAT CTTATCCCCC ATCATATCCA 12960

TAACATGACC AGATGGACCG ATAAACTTGA TACCTACTTC TTCACACATG GTCGCAAATT 13020

TGG7UVTTTTC ACTGAGAAAT CCAAAACCAG GGTGAATAGC TTCTGCCTCA GTCAAGACTG 13080

CAGCTGATAG AACTGCATTA ATATTGAGAT AAGACTCTGT TGCCTTGCCA GGACCAATAC 13140

AAACTGCTTC ATCTGCCAAA AGCGTATGAA GAGCTTCCTT ATCAGCAGTT GAATAAACCG 13200

CTACCGTCGC AATCCCCAAT TCACX5TGCCG CACGGATAAT ACGAACCGCA ATTTCACCAC 13260

GATTGGCAAT TAAAATTTTT CGAAACATGG AGAACCTCCT TAGTTCCCAA TTGCAAAAGT 13320

AAGGGTACCA CTGGCTGCAA GCTTGCCATC CACTTCAGCC TTTGCTTCAA CCACAGCTAT 13380

GGTGCCACGA CGTTTTACAA AAGTCGCTGT CATAACCAAT TGGTCGCCTG GTACAACTTG 13440

CTTCTTGAAC TTAACCTTGT CCATACCAGC GTAAAAGACC AGTTTTCCTT TATTTTCAGG 13500

TTTTGATAAC TCCAACACAC CGGCAGTTTG CGCCAAGGCT TCCATAATCA CAACACCTGG 13560

CATAACTGGG TATTGAGGAA AGTGGCCGTT AAAGAAAGGC TCGTTGATGG TCACATTTTT 13620

GATAGCAACA ATGGTATCCT CGCTCACTTC CAAGAC7VCGG TCCACTAGAA GCATAGGATA 13680

ACGGTGGGGA AGAGCTTCTT TGATTCCTTG AATATCGATC ATTTGATACG TACCAATCCT 13740

TTACCAAACT CAACCATTTC TTCGTTAGAG ACGAGAATTT CCGTTACCAC ACCATCCTTA 13800

GGAGCTGGGA TTTCATTCAT GACTTTCATG GCTTCGATAA TTACCAATGT TTGACCTTTT 13860

TTGACACTAT CACCAACTGT AACGAAGGCA GGTTTATCTG GTCCAGCAGC CAAGTAAACC 13920

ACTCCAACAA GTGGACTCTC TACAAGATTT CCCTCAGTAG CCACACTTGC TTCAGCTGGA 13980

GCTGGAACTT CTTCTGCTAC AGTCTCTGCT GGAGCAGATG TAGGAGCTAC TGGACTCGGT 14040

GTTGCTAGAA CXSGGTGCTGG AGCGACTTGA GTTGCAACTT CAGGCACAGG TCTTGCTTCA 14100

TTCTTGCTAA ACTGCAACTC ATCCGTCCCA TTTTTATAAG AAAATTCTCT CAAACTTGAC 14160

TGGTCAAATT GAGTCATCAA GTCTTTAATA TCGTTTAAAT TCATACTTAT CTATTCTCCC 14220

AACGTTTGAA AGCAAGAACT GCATTGTGGC CTCCAAAACC AAAAGTATTT GAAATAGCGT 14280

ATGGAATTTC TTTCTCCAAG CCTTGTCCAT AAACGACATT AGCTTCGATA TAATCTGATA 14340

CTTCACTTGT CCCAGCTGTC ATTGGTACAA AGTTATGACG CATAGCTTCG ATGGTGACGA 14400

TAGCTTCTAC TGCACCCGCA GCCCCCAGCA AATGTCCTGT AAAAGACTTG GTTGATGATA 14460

CAGGTACTTC CTTACCAAGA ACAGCTACm TAGCACCACT TTCTCCTTTT TCATTGGCAG 14520

GAGTTGACGT TCCGTGAGCA TTGACATAGG CTACTTGCTC TGGAGAAATC TCAGCTTCTT 14580

CCAAGGCTAG TTTGATGGCX: TTGATAGCTC CCTGACCTTC TGGATGTGGA GAAGTCATGT 14640

GGTAGGCATC ACAAGTATTT CCGTAACCAA CXy^CTTCAGC CAGGATAGTA GCTCCACGTT 14700
TTTCAGCGTG TTCAAGACTT TCTAGAACCA ACATCCCTGA ACCTTCACCC ATAACAAACC 14760

CATTGCGATC CTTATCAAAT GGGATCXSAAG CACGAGTTGG ATCCTCTGTA GTAGAGAGAG 14820 

CTGTTAAGGC TTGGAAACCA GCGATGGCAA AAGGTGTGAT AGAAGCTTCT GTTCCTCCCA 14880

CCAACATCAC ATCTTGGAAA CCAAACTTAA TGGAGCGGAA GGCATCCCCA ATCGCATCAT 14940 

TTGATGAAGA GCAGGCAGTA TTGATAGATT TACAAACACC GTTTGCACCA AAACGCATGG 15000 

CTACATTCCC AGAAGCCATA TTTGGTAAAG CTTTTGGAAG AGTCATTGGT TTGACACGTT 15060 

TGGGTCCTTT TTCATGAAGG CGAAGTACCT GATCTTCAAT TTCCTTGATT CCACCAATAC 15120 

CAGATGCAAC GATAACACCA AAACGATCCC TATTAAGAGC CTCTACATCA AGATTGGCAT 15180 

GATTTACAGC CTCTTGGGCT GCATACAAGG CATATAAAGA ATAGTTATCA AAACGGTTGG 15240 

TATCTTTTTT TACAAAGTAT TTATCGAACG GAAAATCTTG GATTTCTGCC GCATTATGCA 15300

CATCAAAGTC ACTATGATCA AATTTTGTAA TGCXIACCAAT GCCGATTTTC CCAGTTGCTA 15360

AACTATTCCA AAATTCTTCT GGTGTATTTC CGATTGGAGA TGTTACTCCA TAACCTGTTA 15420

CCACTACTCG ATTTAGTTTC ATTCTTTTCA CCTCTAGCTT TCGCTACATA CTTAAGCCAC 15480 

CATCAATGGC AACCACTTGT CCAGTTAGAT AATCTTGGCC TGCTAAAAAT ACTGTCAAAT 15540 

CTGCAACCTG CTCTGCCTGC CCAAATTCTT TCATCGGAAT CTGAGCTAGT GTAGCTTCCT 15600 

TAATCTTATC TGACAGGATA GCGGTCATAT CAGACTCAAT CATTCCTGGA GCAATCACAT 15660 

TGACTCGTAT ATTCCGACTA GCGACCTCGC GTGCCACAGA CTTGGTAAAG CCAATCAAGC 15720 

CAGCCTTAGA AGCAGCATAA TTAGCTTGAC CAATATTCCC CATCAAACCA ACAACACTAG 15780

ACATATTAAT GATAGCACCT TCTCTGGCTT TCATCATCGG TTTCAAGACT GATTGTGTCA 15840 

TATTAAAGGC ACCAGTCAGA TTGACCTTGA GCACTTTTTC AAAATCTGCT TCTGTCATCT 15900 

TGAGCATAAG AGTATCTTGG GTAATCCCTG CATTGTTGAC CAAAACATCT ACTGAACCCA 15960 

GTTCTGCAAT AGCTTGATCA ATCATACGCT TAGCGTCTGC AAAATCTGAT ACATCTCCTG 16020 

AAATGGGAAC CACCTTGATA CCATAGTTTG AAAACTCAGC GAGCAATTCT TCTGAGATTG 16080 

CCCCACGACT GTTTAAGACA ATGTTGGCTC CTGCTTGAGC AAACTTGTGG GCGATGGCAA 16140 

GACCAATTCC ACGACTCGAA CCTGTAATAA AGATATTTTT ATGTTCTAGT TTCATTTTTT 16200 

TCCTTTCAAA ACTTCTACTT ATTTTAGTCT ATTTTTCTAA AAGTGCTACT AAACTCGCTT 16260 

GATCTTCCAC ATGAGCTAAG TGAGCAGTTT GATCAATTTT TTTAACAAAA CCTGACAAGA 16320 

CTTTCCCCGG TCCAATCTCG ATAAAGTTGC TTATGCCTGC TTCTTGCATG ACCCCAATAC 16380 

TTTCATAGAA ACGAACGGGT TCCTTGACCT GACGCGTCAA GAGCTGAfiCA ATGTCCTCTT 16440
TTTGCATCAC AGCAGCTTCT GTATTGCCGA CTAGGGGACA AGTAAAATCT GAAAAACTTA 16500

CCTGAGCTAG AGTTTCAGCT AGTTTCTGGC TAGCAGGTTC AAGGAGAGCG GTGTGAAAGG 16560

GACCTGACAC CTTAAGAGGA ATCAAGCGTT TGGCACCTGC TTCTTGCAAA AGTTCAACCG 16620

CTCGATCAAC TGCAACCACT TCTCCAGCAA TGACGATTTG TGCAGGTGTG TTATAGTTGG 16680

CTGGAGTAAC CACTCCAAGT TCAGAAGCTT TTTGACAGGC TTCTTCAATG ACCTCTACTG 16740

GCGTATTGAG AACTGCTACC ATCTTGCCAG AGTCAGCAGG AGCCGCTTCT TCCATATAGG 16800

CTCCACGCTT AGCTACCAAG GCAACCGCAT CTTCAAAATC CAAGGCGCCA CTTGCCACCA 16860

AGGCAGAGTA TTCTCCAAGA GACAAACCAG CAACCATATC AGGCTGATAG CCCTTTTCTT 16920

GCAATAAACG GTAGATAGCA ACCGAAGTCG CTAGAATGGC TGGTTGCGTA TAGCGGGTCT 16980

GATTGAGTTT GTCTTCTTCC GTATCGATGA GATAACGCAA ATCATAACCG AGCACCTGGC 17040

TCGCTCGATC AATCGTTTCT TTAACAATCG GATACTGATC ATAGAAATCC CGTCCCATCC 17100

CTAGATACTG GGCACCTTGA CCAGCAAATA AAAAGGCTGT TTTAGTCATT TCTTACAACT 17160

CCTGTCCAGC GAGAGGCTTC TTCTTGAATT TTCTTAGCGG CTCCGTAATA CAAATCTTTT 17220

AGGATTTCTT CAGCTGTTTC TTCTTTAGAA ACAAGCCCTG CGATTTGACC TGCCATAACA 17280

GAGCCACCAT CCACATCACC GTGAACAACT GCTTTGGCTA GAGCACCTGC TCCCATTTGT 17340

TCAAAGATTT CTAAATCAGG ATCTTCTTGC TTAAAGGCAT CTTTTTCAGC CAGTTCAAAA 17400

TCTCTAGTCA ACTGATTTTT aatagcacx;a ACAGCATGAC CAAAGTGCTG AGCTGAAATC 17460

GTAGTATCAA TATCCCTTGC TTTTAAAATT TTCTCCTTGT AGTTTCGATG GGCATTCGAC 17520

TCTTTTGCAA CTACAAACCG TGTCCCCACC TGTACAGCCT CTGCACCTAG CATAAAGCCA 17580

GCCGCAGCAC CTTCACCATC CGCAATTCCT CCTGCAGCAA TAACAGGAAT AGATATAGCT 17640

GTGGCTACCT GTCGCACCAA GGTCATGGTT GTTAATTTAC CGATATGCCC CCCAGCTTCC 17700

ATTCCTTCTG CAATAACAGC GTCTGCACCG ATTTTTTCCA TGCGTTTAGC TAAAGCGACA 17760

CTAGGAACAA CAGGAATAAC GATTATCCCA GCTTCATGGA AACGTTCCAT ATACTTGCTT 17820

GGATTTCCTG CTCCTGTTGT GACAACTTTA ACACCTTCTT CAATAACGAG ATCCACGATG 17880

TCTTCCACAA AGGGAGATAA GAGCATGATG TTGACCCCAA AGGGTTTATC AGTCAATGAT 17940

TTGATTTTAT CAATATTGGC CTTGACAACT TCTTTCGGGG CATTTCCCCC ACCXSATAATT 18000

CCTAATCCTC CAGCCTTGGA AACAGCCCCT GCCAAATCAC CATCAGCAAC CCAGGCCATC 18060

CCTCCTTGGA AAATAGGATA ATCAATCTTC AATAATTCTG TAATACGCGT TTTCATAGTG 18120

CCTCCAACCT TCCTTGCTTA CGTAATAGTT CGATTTCACC ATAATTTGAC AGTCAAACTA 18180

TTACCTAAAC AAGAGGGAGT GGGTTTCTCC CTACTCCTTC TACTAATATT CTGCTTATTT 18240
TGCTTGCTCT TCAACGTAAG CAACCAAGTC ACCAACTGTT TTCAAGTCAT TTTCTGCTTC 18300

GATTTGGATA TCAAAAGCAT CTTCGATTTC TGAGATTACT TGGAACAAGT CCAATGAATC 18360 

TGCGTCCAAA TCATCAAAAG TTGATTCAAG TGTTACTTCT GATGCGTCTT TTCCAAGTTC 18420 

TTCAACGATA ATTTCTTGTA CTTTTTCAAA TACTGCCATG ATAGGACTCC TTTAAAATAA 18480 

ATAGTTTTTT TATAACAATG TGTTCACCAC ATGATTACCT AAATTGTAAG AATGAGCGTG 18540 

CCCCAGGTCA AGCCTCCACC GAAGCCTGAT AGAAGAACAG TCTGGCTACC ATCTAAAGGG 18600 

ATGAGACCTT GTTCTACACA CTCTGAAAGT AAAATCGGGA TACTGGCTGC ACTGGTATTG 18660 

CCATATTCCA TCATATTGGC TGGAAGTTTG GCTCGGTCAA CACCAATTTT TCTAGCCATC 18720 

TTATCCAAAA TACGGTCATT GGCTTGATGA AGTAGCAGAT AATCCAAGTC TGTCACCTCT 18780 

ATAGGAGATT CATCAATAGT CTGCTTGATA GACTTGGCTA CATCTCGAAT GGCAAAATCA 18840 

AAGACTGTGC GTCCATCCAT CTTCAAAAAC GAATCTGCAC TTTCTTGATC TGAAAATGGA 18900 

GAATGTAAAC CTGAATGCCC ATAAGTTAAA CACTCGCTGC GACTTCCATC GCTATTGAGA 18960 

CTCTCAGCTA AGAAATGCTC TTQCTCGCTA GCTTCTAACA AGACACCACC AGCACCATCT 19020 

CCAAACAACA CAGCTGTTGA TCGATCCGAC CAATCXSACTG CCTTAGAGAG GGTTTCACTA 19080 

CCAATCACCA AGCCTTTTTG AAAGCGACCA GAAGCGATAA ACTTTTCAGC AGTTGAAAGA 19140 

GCAAATACAA ATCCACTGCA AGCCGCGGTT AAGTCAAAAG CAAAGGCTTT ATTAGCACCA 19200 

ATATTAGCTT GAACACGAGC AGCTGTAGAG GGCATCATCG AATCTGGAGT AATGGTAGCT 19260 

AGGATGATAA AATCCAGTTC TTCTCCTGTT ATTCCAGCTT TTGCCATCAG TTTCTTAGCA 19320 

ACCTCTGTAG CCAAATCACT GGTAGATTCT GTTCTTGAAA TATGCCTTTG TCGTATTCCC 19380 

GTTCGACTTG AAATCCACTC ATCATTGGTA TCCATAATCT GAGCCAAGTC GTGATTTGTA 19440 

ACCACTTGCT CTGGCACATA ATGAGCAACC TGACTTATTT TTGCAAAAGC CATTATTTCA 19500 

AATCCTCCAA AAATTGGTAA AGATTAGTCA AACCTTTACC CATGACAGCA ATTTCTTCCT 19560 

CGCTCATGCC ATCAATAATT TTTTCTACCA TGGCCTTGTO GAAGCGTTTA TGCAGTCTAT 19620 

GAATCAAGCG ACCCTTCTTT GTCAAATGCA GATGCACCAC ACGACGATCC TGTTCTGACC 19680 

GAACTCGCTC AATGTAGCCC GG 19702 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6211 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



GAAAATTTCC TCTCTTCTCT TGAAAAATTT TGAAAAAATG GTATGATAGT AACAAGTTAT 60

TTTTAAGAGG AAAGAAAGGG GAATAATGGA GAAAATCAGT TTAGAATCTC CTAAGACGGG 120

GTCGGACCTA GTTTTGGAAA CACTTCGTGA TTTAGGAGTT GATACCATCT TTGGTTATCC 180

TGGTGGTGCG GTTTTGCCTT TTTATGATGC GATATATAAT TTTAAAGGCA TTCGCCACAT 240

TCTAGGGCGC CATGAGCAAG GTTGTTTGCA TGAAGCTGAA GGTTATGCCA AATCAACTGG 300

AT^GTTGGGT GTTGCCGTCG TCACTAGTGG ACCAGGAGCA ACAAATGCCA TTACAGGGAT 360

TGCGGATGCC ATGAGCGATA GCGTTCCCCT TTTX3GTCTTT ACAGGTCAGG IXSGCGCGAGC 420

AGGGATTGGG AAGGATGCCT TTCAGGAGGC AGACATCGTG GGAATTACCA TGCCAATCAC 480

TAAGTACAAT TACCAAGTTC GTGAGACAGC TGATATTCCG CGTATCATTA CGGAAGCTGT 540

CCATATCGCA ACTACAGGCC GTCCAGGGCC AGTTGTAATT GACCTACCAA AAGACATATC 600

TGCTTTAGAA ACAGACTTCA TTTATTCACC AGAAGTGAAT TTACCAAGTT ATCAGCCGAC 660

TCTTGAGCCG AATGATATGC AAATCAAGAA AATCTTGAAG CAATTGTCCA AGGCTAAAAA 720

GCCAGTCTTG TTAGCTGGTG GTGGAATTAG TTATGCTGAG GCTGCTACGG AACTAAATGA 780

ATTTGCAGAA OGCTATCAAA TTCCAGTGGT AACCAGTCrr TTGGGACAAG GAACGATTGC 840

AACGAGTCAC CCACTCTTTC TTGGAATGGG AGGCATGCAC GGGTCATTCG CAGCAAATAT 900

TGCTATGACG GAAGCGGACT TTATGATTAG TATTGGTTCT CGTTTCGATG ACCGTTTGAC 960

GGGGAATCCT AAGACTTTCG CTAAGAATGC TAAGGTTGCC CACATTGATA TTGACCCAGC 1020

TGAGATTGGC AAGATTATCA GTGCAGACAT TCCTGTAGTT GGAGATGCTA AGAAGGCCTT 1080

GCAAATGTTG CTAGCAGAAC CAACAGTTCA CAACAACACT GAAAAGTGGA TTGAGAAAGT 1140

CACTAAAGAC AAGAATCGTG TTCGTTCTTA TGATAAGAAA GAGCGTGTGG TTCAACCGCA 1200

AGCAGTTATT GAACGAATTG GTGAATTGAC GAATGGAGAT GCCATTGTGG TAACAGACGT 1260

TGGTCAACAC CAAATGTGGA CAGCTCAGTA TTATCCCTAC CAAAATGAAC GTCAGTTAGT 1320

GACTTCAGGT GGTTTGGGAA CAATGGGCTT TGGAATTCCA GCAGCAATCG GTGCTAAAAT 1380

TGCTAACCCA GATAAGGAAG TAGTCTTGTT TGTTGGGGAT GGTGQTTTCC AAATGACCAA 1440

CCAGGAGTTG GCTATTTTGA ATATTTACAA GGTGCCAATC AAGGTGGTTA TGCTGAACAA 1500

TCATTCACTT GGAATGGTTC GCCAGTGGCA GGAATCCTTC TATGAAGGCA GAACATCAGA 1560

GTCGGTCTTT GATACCCTTC CTGATTTCCA ATTGATGGCG CAGGCTTATG GTATTAAAAA 1620

CTATAAGTTT GACAATCCTG AGACCTTGGC TCAAGACCTT GAAGTCATCA CTGAGGATGT 1680
TCCTATGCTA ATTGAGGTAG ATATTTCTCG TAAGGAACAG GTGTTACCAA TGGTACCGGC 1740

TQGTAAGAGT AATCATGAGA TGTTGGGGGT GCAGTTCCAT GCGTAGAATG TTAACAGCAA 1800 

AACTACAAAA TCGTTCAGGA GTCCTCAATC GCTTTACAGG TGTCCTATCT CGTCGTCAGG 1860 

TTAATATTGA AAGCATCTCT GTTGGAGCAA CAGAAGATCC GAATGTATCG CX5TATCACTA 1920 

TTATTATTGA TGTTGCTTCT CATGATGAAG TGGAGCAAAT CATCAAACAG CTCAATCGTC 1980 

AGATTGATGT GATTCGCATT CGAGATATTA CAGACAAGCC TCATTTGGAG CGCGAGGTGA 2040 

TTTTGGTTAA GATGTCAGCG CCAGCTGAGA AGAGAGCTGA GATTTTAGCG ATTATTCAAC 2100 

CTTTCCGTGC AACAGTAGTA GACGTAGCGC CAAGCTCGAT TACCATTCAG ATGACGGGAA 2160 

ATGCAGAAAA GAGCGAAGCC CTATTGCGAG TCATTCGCCC ATACGGTATT CGCAATATTG 2220 

CTCGAACGGG TGCAACTGGA TTTACCCGCG ATTAAAAATC CAACTTAAAT TTATTAAACC 2280

AGCCTAAAAG GCAATAAATA ATAGAAAAGA GAGAAAAGCT ATGACAOTTC AAATGGAATA 2340 

TGAAAAAGAT GTTAAAGTAG CAGCACTTGA CGGTAAAAAA ATCGCCGTTA TCGGTTATGG 2400 

TTCACAAGGG CATGCGCATG CTCAAAACTT GCGTGATTCA GGTCGTGACG TTATTATCGG 2460 

TGTACGTCCA GGTAAATCTT TTGATAAAGC AAAAGAAGAT GGATTTGATA CTTACACAGT 2520 

AGCAGAAGCT ACTAAGTTGG CTGATGTTAT CATGATCTTG GCGCCAGACG AAATTCAACA 2580 

AGAATTGTAC GAAGCAGAAA TCGCTCCAAA CTTGGAAGCT GGAAACX3CAG TTGGATTTGC 2640 

CCATGGTTTC AACATCCACT TTGAATTTAT CAAAGTTCCT GCGGATGTAG ATGTCTTCAT 2700 

GTGTGCTCCT AAAGGACCAG GACACTTGGT ACGTCXSTACT TACGAAGAAG GATTTGGTGT 2760 

TCCAGCTCTT TATGCAGTAT ACCAAGATGC AACAGGAAAT GCTAAAAACA TTGCTATGGA 2820 

CTGGTGTAAA GGTGTTGGAG CGGCTCGTGT AGGTCTTCTT GAAACAACTT ACAAAGAAGA 2880 

AACTGAAGAA GATTTGTTTG GTGAACAAGC TGTACTTTGT GGTGGTTTGA CTGCCCTTAT 2940

CGAAGCAGGT TTCGAAGTCT TGACAGAAGC AGGTTACGCT CCAGAATTGG CTTACTTTGA 3000

AGTTCTTCAC GAAATGAAAT TGATCGTTGA CTTGATCTAC GAAGGTGGAT TCAAGAAAAT 3060

GCGTCAATCT ATTTCAAACA CTGCTGAATA CGGTGACTAT GTATCAGGTC CACGTGTAAT 3120 

CACTGAACAA GTTAAAGAAA ATATGAAGGC TGTCTTGGCA GACATCCAAA ATGGTAAATT 3180 

TGCAAATGAC TTTGTAAATG ACTATAAAGC TGGACGTCCA AAATTGACTG CTTACCGTGA 3240 

ACAAGCAGCT AACCTTGAAA TTGAAAAAGT TGGTGCAGAA TTGCGTAAAG CAATGCCATT 3300 

CGTTGGTAAA AACGACX3ATG ATGCATTCAA AATCTATAAC TAATTAGAAA TATATAGCGC 3360

TGGAGATGAT TTTATGAAAA AGATTATGAG AAAAATTGCA TCGTTATTAT TGGTTCTAGT 3420
TGTATAATGT AATTACACCG TCGGTAATAG TGCTAGCAGA CCAAAATAAA GCAGATTGGT 3480

CGTATGATGA AAATGCTGTA ATTAACATTT ATGATGATGC TAATTTTGAA GATGGTAGGT 3540 

TGCATATGAA CTTTGAACAA TTCTTCAAAT TGGCACAAAT AGCTAGAGAA GAAGGTCTTG 3600 

AAATTCATTC TCCGTTTGAG AGAGCTGGTG CGACTAAATC TGCTCGTTAT ATAGCGAAAT 3660 

GGATTTTGAG AAATAAAAAA CATTAACAAA TATAGTTGGT AAATCATTAG GACCTAAATC 3720 

AGCTGTTAGA TTCGGAGAAG CTTTATCCTA TATTGAAGGT CCTCTTCGCA GAATAAATGA 3780 

GACGATAGAT GGCGGTTTAT ATCAAATAGA GCAAATTATT GCATCTGGAT TGAAAGAATC 3840 

GGGTTTAAAT GACTGGACTG CGAAAACTTT AGCTTCAGCT ATTCGTGGGA TATTAGATGT 3900 

ACTTATTTAG GGGTTGAAAT CATATGAATA TTACCAATTT GTTTTCTATC AAGACAGGAT 3960 

GTGATGAAAC TGATAGGCAA CTGCAAAAAC TATTTTTTCA GTTGGATTTA CAATTGGGAG 4020 

AATTGACAGA TCAACTAAGA AAATTAGATT CTAATTTTGT TCCTCGTAGT CAATTTGTAG 4080

ACACGTTGGA TTTGAATGAT GTAGAATATA AAGAAATTTT AAACTATTTT ATCTTCCATC 4140 

GTAATGATAG TGAAGAAAGT TTGGTAGAAT GGTTATATGA TTGGATTTCC ACAAATCGTT 4200 

ATGAACTTCC TAAAGAGTTT TCGATTCGTA TGGCTCATAA ATACCATGAA AGTGTTACTG 4260 

AAGTTTTCGG AGATGAATAA CTAAAAAACA GTCATTAGTG ACTGTTTTTT ATAGAAAAAG 4320 

AGGTTTTATA TGTTAAGTTC AAAAGATATA ATCAAGGCTC ACAAGGTCTT GAACGGTGTG 4380 

GTTGTGAATA CTCCACTGGA TTACGATCAT TATTTATCGG AGAAGTATGG TGCTAAGATT 4440 

TATTTGAAAA AAGAAAATGC CCAGCGTGTT CGCTCCTTTA AAATTCGTGG TGCCTATTAT 4500 

GCCATTTCCC AGCTCAGCAA GGAAGAACGT GAACGTGGGG TAGTCTGCGC TTCTGCGGGA 4560 

AATCATGCGC AGGGAGTAGC CTATACTTGT AATGAAATGA AAATTCCTGC TACTATCTTT 4620 

ATGCCCATTA CTACGCCACA ACAAAAGATT GGTCAGGTTC GCTTTTTTGG TGGGGATTTT 4680 

GTAACTATTA AACTAGTTGG AGATACCTTT GATGCCTCAG CCAAAGCAGC TCAAGAATTT 4740 

ACAGTCTCTG AAAATCGTAC CTTTATTGAT CCTTTTGATG ATGCTCATGT TCAAGCAGGT 4800 

CAAGGAACAG TTGCTTATGA GATTTTAGAA GAAGCTCGAA AAGAATCGAT TGATTTTGAT 4860 

GCTGTCTTGG TTCCTGTTGG TGGTGGCGGT CTCATTGCCG GGGTTTCTAC CTATATCAAG 4920 

GAAACAAGTC CAGAGATTGA GGTTATCGGA GTAGAGGCGA ATGGAGCGCG TTCCATGAAA 4980

GCTGCCTTTG AGGCTGGAGG TCCAGTAAAA CTCAAGGAAA TTGATAAATT TGCTGATGGG 5040

ATTGCTGTGC AAAAGGTAGG TCAGTTGACC TATGAAGCAA CTCGTCAACA TATTAAAACT 5100

TTGGTAGGTG TCGATGAGGG ATTGATTTCT GAAACCTTGA TTGACCTTTA CTCTAAGCAA 5160

GGGATAGTCG CAGAACCTGC TGGAGCGGCT AGTATCGCCT CTTTAGAGGT TTTAGCTGAA 5220 
TATATTAAGG GGAAAACCAT TTGTTGTATC ATTTCTGGAG GAAATAATGA TATCAACCGT 5280

ATGCCAGAAA TGGAAGAGCG TGCCTTGATT TATGATGGTA TCAAACATTA CTTTGTGGTC 5340 

AATTTCCCAC AACGTCCAGG AGCTTTGCGT GAGTTTGTAA ATGATATCCT GGGGCCAAAT 5400 

GATGATATCA CACGTTTTGA GTATATCAAA CGAGCTAGCA AGGGAACAGG CCCAGTATTA 5460 

ATTGGGATCG CTTTAGCAGA TAAGCATGAT TATGCAGGTT TGATTCGTAG AATGGAAGGT 5520

TTTGATCCAG CTTATATTAA CTTAAATGGT AATGAAACGC TTTATAATAT GCTTGTCTGA 5580 

GGACTAATAA AAAAATATCA TACCTTCATT TTGATTTCCT ATCTATTGAC AAGCATAGTC 5640 

ACACTGTCTT TAATACTCTT CGAAAATCTC TTCAAACCAC GTTAGCTCTA TCTGCAACCT 5700 

CAAAACAGTG TTTTGAGCAA CTTGCGGCTA GCTTCCTAGT TTGCTCTTTG ATTTTCATTG 5760 

AGTATAAGGT ATGATTTGAT TTCTTTTTGT TGACAAATAT ACTATATTAA AAAGATATAT 5820 

AAGTAATTAA CTGAGCTTAT CTGTCTTGTC ATCTCTATTA AOGATGGTTT AGATAATCGG 5880

GTGTCTGCTT CTAGGCTAGC ACCTCAATAT CCAAAGGAGT GATGAATTTG AAGGACATAA 5940

GGAATACCTA TCTCTCAGAT GATTTATTGA GGAAGAAAGA TAGGAGTTTT TGAGCTAGTG 6000

AAGGCTTGGA TTTCTAAAGG TTAGAACTAT CATCTTCAGT TCTTAAATCG AAGAAATAAG 6060 

CTATCTTACG GAAATAGAGA AGCATTTTTT AAGAACTTGA ATAATTTCGC ACCTTAAGAG 6120 

GGTAATAATA CAGTATTTTT ATTAGCAAAT ATTTATGGTG TAGAGGCTAG CAAAACCTAT 6180 

ATATTATCGG ATTTAAAAAG GAAGTAAGAA A 6211
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7939 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CCGGACTCCC CACGATTCTT CAAAATAACT GAGTATATTT CTATCTTGAT TTTCAGATAT 60 

AAATTCTTCC TTCTGTGGCC TCTTCTTACG CTTGAGAAGA GCTTCTCCGA CATGGCTTCT 120 

TCCTTACTGA GCAAAACCTT GAGCATAGAT AAGTTTGACT GGCAAGCGTG CTCTTGTATA 180 

TTTGGCTCCC TTCCCACTAT TGTGGATAGC GAGGCGTCTT CTCATATCAG TCGTATAGCC 240 

TATATAGTAG GATCCATCAC GACACTCCAG AACGTACATA TAAGCCTTAT GATCCATAAT 300 

AAATCTCTTC GATTTCGGGC GTATAAGAGC CATCATCATT GTGGACAATC AAAGGAGGTA 360
AGACCTTAAA GCCACTTGTT GAGCCATCCT TGATCGCXrrC AATCAAAAGC ATATTGGCTT 420

CCTTTTCTCT TTTTGGATAA ACAAACTGCA GGCGCTTAGG GGCTAGATTA TGTCGTTTTA 480

ACGTATCCAA AATATCCAGA AGTCGATCAG GACGATGAAC CATGGCTAAA CGCCCATTAG 540

ACTTGAGAAT ACTCTGGGCA CTACGACAGA TTTCTTCCAA ATTAGTCGTG ATTTCGTGTC 600

GAGCCAAGAG ATAATGTTCA CTCTCGTTCA GATTAGAATA AGGATTCACC TTGAAATAGG 660

GTGGATTACA CAAAATCATA TCCACCTTAC TCCCCTGAAT GTGAGCAGGC ATATTTTTCA 720

AATCATCGCA GATGACCTGC ATTTGCTCCT CTAATCCATT CAAACGGACA GAGCGTTCAG 780

CCATATCCGC CAAACGCTCC TGAATCTCAA CAGACAATAT CTGTGCTTGA GTACGAGTGC 840

TAGCAAAAAG CCCCACTGCT CCATTCCCAG CACAGAAATC CACAATCAAC CCCTTCTTAG 900

GAAAACGTGG AAATCGTGAT AAGAGAACAC TATCCACCGA ATAGCTAAAA ACCTCTCTAT 960

TTTGAATGAT TTTGATATCT GTCGAAAAGA GCTGGTTAAT GCGCTCTCCT GATTTTAATA 1020

ATTGTTCTTC TTCCATGGTC CTATTATAGC AAATTCATAT TAACATTACA AAAAATATAA 1080

AACTCTAAAC TACTTCTTCT TTTTTAAATG GTGCAGGGCT TCTCCAGTCC AGATTGGTAG 1140

CATTCGTCGA AAGGGAGCAA AGCCGTAGTT AAAGCGGTCG CTTGAAAAGC GTCTCCGTCT 1200

AGGAAACTGG TACTTTTCTT CCTCCAAAGT GCGGATAGAA AGACTGGCTT TCCCTGTAAA 1260

TTCATCTAAA TCCACTACCT GAACTTGAAC CTCTTCATCG ACTTTCAAGG TTTCATGAAT 1320

ATTTTCAATA AATCCTGTCC GAATCTCTGA AATGTGAATC AGCCCCGTAT CACCCGTCTC 1380

TAACTCAACA AAGGCACCGT AGGGCTGAAT CCCTGTAATA CGCCCCTTTA GCTTATCACC 1440

GATTTTCATC TTAGTCCTCG ATTTCAATAG TTTCAATTAC AACATCTTCA ACTGGCTTGT 1500

CCATAGCTCC TGTCTCAACA GCAGCAATGG CATCCAAGAC AGCGTAAGAT GCTTCATCAG 1560

CTAACTGACC AAAAACCGTG TGACGGCGGT CTAGGTGAGG TOTCCCACCT TGATTGGCAT 1620

AGATTTCTGC AATCGGTTCT GGCCAACCAC CACGAGTAAT TTCTTTCTTA GAATAAGGTA 1680

GGTGTTGGTT TTGCACGATA AAGAACTGGC TGCCGTTGGT ATTTGGACCA GCATTTGCCA 1740

TGGAAAGAGC ACCACGGATA TTGTAAAGCT CTTCTGAGAA TTCATCCTCA AAAGATTCGC 1800

OGTAGATTGA CTCGCCACCC ATACCAGTTC CAGTTGGGTC TCCACCTTGG ATCATAAAGT 1860

CCTTGATAAT ACGGTGGAAA ATGACACCAT CATAGTAGCC ATCTTTTGAA AGAGATACAA 1920

AGTTAGCCAC TGTTTTAGGA GCATGTTCAG GGAAAAGCTT GATACGTAAG TCTCCGTGAT 1980

TGGTCTTAAT AGTCGCAAGA GGACCTTCTA CTGTTTCAAT GTCTACTTGT GGAAAATGCA 2040

ATTCTTTTTC TACCATACCA AATACTTCTA AGGCAGCAAA AATGCCATCT TCTTCTAATG 2100

TTTTTGTAAT ATAATCTGCT TTTTCTTTGA TTTTATCATG AGAAATTCCC ATGGCAACGC 2160
TGATTCCAGC ATAATCAAAG AGTTCCAAGT CGTTGAGACC ATCTCCAAAA ACCATGACCT 2220

TCTCTGGTTT CAAGCCAAGG TGTTCCACAA CCTTTTCCAC CCCCGTCGCT TTGGAGCCTG 2280 

AAATCGGCAC AATATCAGAC GAATGTTGAT GCCAACGAAC CATGCGAAGT TTGTCTGAGA 2340 

GACTGTCAGG CAAGTGCAAG TCATCTCCCT TATCTTCAAft AGTCCACATC TGATAGATAT 2400 

CTTCTTTTTC ATGGAAATCG GGATCTACAT CTAAGTCGGG ATAAATTGGA TTGATAGCTT 2460 

CACTCATCAT ATCGGTGCGA GTCGACAACT TGGCATCATG ACTCCCAACC AAGCCATACT 2520 

CAATTCCTTC TTGCTTAGCC CAAGAGATAT ACTCCTpAAC ATCTGACTTT TCAATCTGAT 2580 

GCTGATAAAT GACCTGACCT TTTTTATCTT CGATATAAGC CCCATTCAAA GTTACAAAAA 2640 

AGTCAGGCTT GAGATCACGA ATCTCTGGAA CAACACCAAA AATGCCACGT CCAGAGGCGA 2700

TTCCTGTTAA AATTCCTTTT TCACGCAACT GTTTAAAAAC AGTGGGAATT GTAGTTGGAA 2760 

TAAACCCTGT CTTTGAATTC CGCAATGTAT CATCAATATC AAAAAAGACA ATCTTGATCT 2820 

TCTTTGCCTT GTATCTTAAT TTCGCGTCCA TCTCACTACC TCTTTCAATC TAACTCTTTC 2880 

CATTATATCA TAAAGTAGGC AAATCCCCTA TTTTCAAAAA GTTTATCATT TTTATTTTAA 2940 

TTTCTTGGAT GAGAAAAGAG ACATATTTAT GAAAAAGCTC CATCGTGCTT TTAATGTGTT 3000 

CrCTTGTTTT CAAACTCGTA AAAAGGGAGC CACTGATCCT AACTCGCTCT CTCATTTCAA 3060 

AGCTTGTGAA AAAAGACCCG TTGGGGTCTT AATTCGCTTT CTTGTTTTCA AGCTCATGAA 3120 

AAAGAGACCC AACTGGGTCT TTTCTTTAAT CTTCGTTTAC GAAAGGCATC AAAGCCATTA 3180 

CGCGAGCGCG TTTGATAGCT GTTGTTACTT TACGTTGGTT TTTAGCTGAA GTTCCTGTTA 3240 

CACGACGAGG AAGGATTTTC CCACGTTCTG AAACGAAACG GCTAAGAAGC TCAGTATCTT 3300

TGTAATCAAC ATATTCAATT TTGTTTGCTG CGATGTAATC AACTTTTTTA CGGCGTTTGA 3360 

ATCCGCCACG ACGTTGTTGA GCCATGTTTT TTCTCCTTTA TAAGTTTAGT TGTCCATTAG 3420 

AATGGTAAAT CATCATCTfSA AATATCCAAT GGGTTTGTTG CTCCAAATGG ATTTTCATTA 3480 

CGTGAAAAGT CTGGTACTGA ATTTGTAGGT GCTGAATAGT TTGCAGTTGG TGCAGAGTAA 3540 

GCTCCACCTG TGTGACCCTC ACGCACACTA CGGCTTTCCA ACATTTGGAA ATTCTCAGCC 3600 

ACGACCTCTG TCACGTAGAC ACGTTGTCCT TGCTGGTTAT CGTAACTACG AGTCTGGATA 3660 

CGACCTGTCA CCCCGATAAG TGAGCCTTTT TTAGCCCAGT TAGCAAGATT TTCAGCCTGT 3720 
TGGCGCCACA TAACGACATT GATAAAATCA GCCTCACX3TT CACCATTTTG ACTCTTAAAT 3780

GTACGGTTTA CTGCAAGAGT AAAAGTCGCA ACTGCTACAT TTGATGGGGT ATAACGCAAC 3840

TCAGCGTCAC GTGTCATACG CCCTACAAGT ACAACATTGT TAATCATAGT TTACCTTCTT 3900
ACGCGTCAAT TTTGACGATC ATGTGACGAA GAATGTCAGC GTTGATTTTT GAAAGACGGT 3960

CAAACTCTTT AAGAGCTGCA TCGTCATTTG CTTCAACGTT AACGATGTGG TAAAGTCCTT 4020 

CACGGAAATC TTGGATTTCG TATGCAAGAC GACGTTTTTC CCAAGTTTTT GATTCAACAA 4080 

CAGTTGCACC GTTGTCAGTC AAAATAGAGT CAAAACGTGC TACCAAAGCG TTTTTAGCTT 4140 

CTTCTTCAAT GTTTGGACGA ATGATATAAA GAATTTCGTA TTTAGCCATT GATATGTTCC 4200 

TCCTTTTGGT CTAATGACCC CAAGACTTTG CAAGGGGTAA GTGAGGTTCG CTCACAATAA 4260

ACTATTATAC TAGAAAAAAT TTTTTTACGC AAGTAAAAAC ACTAGAATTC GAAAAAACGC 4320 

CACATGGGCG TTTTCCTGTT CTTATGQTTT GATACGGTGC AACATACGTG GGAATGGAAT 4380 

AGCTTCACGG ATATGTTTTG TTCCTGCTGC GAAGGTTACC ATACGTTCGA TACCGATACC 4440 

AAATCCTCCG TGTGGAACTG TACCGTATTT ACGAAGGTCA AGGTAGAATT CATATTCTGT 4500 

ACGATCCATG CCAAGTTCAT CCATCTTAGC GACAAGGGCA TCGTAATCTT CCTCACGCAT 4560 

AGACCCACCG ATAATTTCTC CATAGCCTTC TGGAGCAAGC AAGTCTGCAC AAAGCACGCG 4620 

CTCTGGATTT CCAGGAACTG GTTTCATGTA GAAGGCCTTG ATGGCTGCTG GATAGTTCAT 4680 

GACAAATGTT GGCACACCAA AGTGGTTTGA AATCCAAGTT TCGTGTGGTG ACCCAAAGTC 4740 

ATCACCATGC TCAAGATGCT CGTAGTCAGC ATCTTCATCA TTTTCATGCT CTTGCAAGAG 4800 

GTCAATGGCT TGATCOTAAG TGATACGTTT GAATGGCTCT GCAATGTAGC GTTTCAAGAG 4860

TTCTGTATCA CGTTCCAAGG TTTCCAAGGC TTGAGGCGCG CGGTCAAGAA CACCTTGTAG 4920 

AAGAGCTTTC ACATAAGCTT CTTGCAAGTC AAGCGACTCA TCATGTGTCA AGTATGAGTA 4980 

CTCAGCATCC ATCATCCAGA ACTCAGTCAA GTGACGGCGT GTTTTTGATT TTTCAGCACG 5040 

GAAAACTGQA CCAAAGTCAA AGACACGACC AAGAGCCATA GCCCCTGCTT CTAGGTAAAG 5100 

CTGACCTGAT TGGCTCAAGT AGGCTGGCGT TCCGAAGTAG TCAGTTTCAA AGAGTTCTGT 5160 

AGAATCTTCT GCCGCATTTC CTGAAAGAAT TGGGCTGTCA AACTTCATAA AACCGTTCTT 5220 

GTCAAAGAAC TCATAAGTTG CATAGATAAT AGCGTTACGG ATTTGCAACA CAGCTACTTG 5280 

CTTACGAGAG CGTAGCCACA AGTGACGGTT ATCCATCAAA AAGTCTGTTC CGTGTTCTTT 5340

TGGTGTGATT QGGTAGTCTT GAGATTCACC GATCACTTCG ATGTCTGTGA TGTCCAACTC 5400 

ATAGCCAAAT TTAGAACGTT CGTCCTCTTT GACAATACCT GTCACATAAA CAGACGTTTC 5460 

TTGGCTCAAG CGTTTGATAA CATCAAACTT CTCAAGTCCC ACTTCTTCAC CAAATTTTTC 5520 

GACAAAGTTT GGTTTAAAAG CCACACCTTG AAAGAAGGCT GTTCCATCAC GCAATTGTAA 5580 

GAAAGCGATT TTTCCTTTTC CTGATTTGTT GGCAACCCAA GCGCCAATCG TCACTTCCTG 5640 

ACCAACATAG TCTTTTACGT CAATAATCGT TACACGTTTT GTCATTATTT TTCCTTTTCT 5700
TTTTTATTCT TTATGGCAAA CCACCTCTAT ATTGTTCCCA TCCAGGTCAA TCATAAAAGC 5760

AGCATAGTAA ATCGGATGCT CACTTCGATA ACCAGGAGCC CCATTGTCTC GCCCACCTGC 5820

CTCTAAGCCA GCCTCATAAC AAGCCTGAAC TTCTTCCTTA TTTTCTGCTA AAAAAGCAAA 5880

ATGAACAGGA TCTTGTGTTC CCTGAGTCAG CCAAAAATCA CCACCAGGAT GAGGGCTGTT 5940

CGGGGATAGA AAACTAATTA GAGAACTAGT CTTAAAAGCC AATTTATAGT CCAAAGGAGC 6000

GAGAAAACTC CTATAAAATC CTTATGAAAT TTGTAAATCC TTTACCTTAA TCTCAAAATG 6060

ATCAATCATT CTCACTACCC ATAAATGCTT TCAAGCGTTC GACTGCTTCT TTAAGCGTGT 6120

CTAGGTCTGT CGCATAGCTG AGGCGGACAT TTTCTGGTGC TCCAAATCCA GCTCCTGTTA 6180

CCAAGGCCAC TTCGGCTTCT TCTAAGATAA CAGTTGTAAA GTCTGTCACA TCCGTGTAGC 6240

CTTTCATCTC CATGGCCTTT TTGACATTTG GGAAGAGATA GAAGGCCCCT TGCGGTTTGA 6300

CCACTTCAAA TCCTGGTACC TCTGCAAGGA GGGGATAGAT GGTATTAAGA CGTTCCTCAA 6360

AGGCCTGACG CATGCTTTCT ACAGTATCTT GCTCACCTGA TAGAGCCTCA ACTGCTGCAT 6420

ATTGGGCTAC TGCTGACGGA TTCGAAGTTG TTTGACCTGC AATCTTGGAC ATGGCAGCGA 6480

TAATGTCTGC TTCTCCAACG GCATAACCAA TCCGCCAACC AGTCATGGCA TAAGTTTTAG 6540

ACACACCATT GATGACCACT GTTTGCTTGC GAATCGCTTC CGATAGGCTA GAAATCXSGTG 6600

TGAACTCATG ACCATTATAA ACCAAGCGGC CATAGATATC GTCTGCTAGG ATGAGAATAT 6660

CATTTTCTAC AGCCCAGTTT CCAATTGCCA AGAGTTCCTC ACGGGTGTAA ATCATACCTG 6720

TGGGATTAGA TGGCGAATTC AGCACCAAAA CCTTGGTCTT GTCAGTGCGA GCTGCTTCTA 6780

ACTGCTCTAC GGTCACCTTA AAGTGATTGT CTTCCTTAGC AGAAACAAAG ACGGGAACGC 6840

CTTCTGCCAT CTTGACCTGA TCTCCATAGC TAACCCAGTA TGGGGTTGGG ATGATGACTT 6900

[illegible] ATTGACCACA GCCATAAAGA AGGTATAGAG AGAATATTTG GCTCCCGCAG 6960

CGACTGTCAC TTGATTTGAC GCTACAGAAT AGCCGTAAAA GCGCTCAAAG TAGCTATTGA 7020

CCGCCGCCTT AAGCTCTGGC AGACCTGAGG TTACTGTATA AAAAGAAGCA CX5CCCATCTC 7080

GAATCGATGC AATGGCGGCA TCTTGGATAT TTTTGGGAGT AGTGAAATCT GGCTCACCCA 7140

AGGTTAGAGA CAAAATATCT CTACCCTCAG CCTTCAGTGC TTTGGCACGG GCTCCAGCAG 7200

CCAAAGTCAC ACTTTCTTCC ATTTCTAAAA CACGGTTGGA TAGTTTCATA GGCCCTCCTT 7260

GTTGACCAAT GCTCCTGTTT CAAAATCTAC TAGATAAAAA TCAGATCCTG ACTTAACTTC 7320

CCAGATTGGC TTATCTTGAT AACGGCCAAA GGTTATCTTG TCAATCTCGC CAGCTCCCTT 7380

TTCCTTAGAA ACCGTTTCTG Cm-lWrTG TGAAACACCC TGATTTAGCT GATAAACGTA 7440
AATCTTATGG TCATCTTTAC CAATCAGGAC AGCAAGCGCT TCTTGCTGTT TGTTACGACC 7500

AAGAACGCTG TAATAAGATT CCAAGCCATT GTATAAATCA ACCTGATCAG CCTGCTCTAA 7560 

TCCTGCATAC TGCTGAGCTA ATTTTTCTCC TTCACTTTTA GCTGTTTGAT AGGGTTTCAT 7620 

GCTAAGAGAA ACCATATACA GAAAGGAACC ACTGATAACC ACAAACAAAA TCGTCATCCC 7680 

TAGACCATAC TGCCACAGTA GATTATTTTT TGCTTTGTTT TGTCTTTTTT TCACTCGTCT 7740 

ATTTTACCAT CTATTAAGCT TTATTACAAG TGAATATAAG AATACTCTTC GAAAATCTCT 7800 

TCAAACCACG TCAGCTTTAT CTGCAGACCT CAAAGCTGTG CTTTGAGCAA CCAATTCTAT 7860 

TTCTCCCTTC AAACAAAACC GATTTTGAAA GTGAAACAGT TCTTACTTTT TCAGTCACAA 7920 

ATGATTAGAG TTTGCCGGG 7939 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9897 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



CCGCTCTACC GTCAAATAAT TACCATTTTG TTTAATACCG AAATTTTTAT CTACTGAAAA 60

TTCAGTTGGT CTGTTGGTAC GATCGTCGTA TACAGTACCA TTCTCACGAA TAGTATAATT 120

GTAATCAGTA TCACCTTGTT TCCTTAATTT AAGGTAATAA TTACCATCAA TTTGTTTATA 180

ACCTGAATCT TTTCTAGTTG CTTCTCTAAA ACTTACTCCA GCAGGCATCA CATCAGCAAA 240

CATGAGTACT TGTTTGTTCT TTTTTTCAAC AATAACAGAG TCAATATAGG TTGCACCACC 300

GCTGATTTGT AAGTCACGTC CACCAACTTC ACGAGGCCAT TCTAATGGTA CTGGCGCAAA 360

ATCATCGAAT GCCAATGTTA ATTTTGGTTT AGTCCATGTC TTACCATTAT CATCACTATA 420

ACTTGTAGCA ATATTAATTT TATTCAAGAA ATCATGAGTT CCACCGTAAC GAGCGTCAAT 480

GCTTGAAAAT ACCCGACCAT TGCTAAAAGT ATACAGAACT GGAATACGGA AATAGTTAGA 540

ACCTGTTGTA TCATTAGCCG TATAAATTAA ATGTCCAGTA ACAGCGTTTG TTGTCATCTT 600

TTTAACAGTT TCTTCATCCA ATGCACTATT AAAGAATTTG ATATTTTCTA GTGTTCCGTT 660

AAAACCAAAC GCCGTTTTTC CTGCACGTTT CACTCCCCCA AGCATATAGT AATCAATACC 720

TTTAATATCC TTGATGTTTA GGAAATTATC CACTTTCTTT TCTACTACTT TTGTACCATT 780

TGCGTATAAA GAATATGTTT TTTTGACTGA ATCTGCTACT ACTGCAACAG TGTTAGTCAC 840

AGCCTCTTGT TTGTACTTAC CCCAAACTGA AGCAGGTCTG GATACTAGGT TATTTTTATT 900

GGAAGAAGTA TCACGCGCTT CCATCCCCAA CTCACCATTG TCTCTAAGGA ACACATCTAC 960

ATAACTATTT TGTTGACCGG GTTTGGAATT AGATATTCCA AACAGAGCTT GTAAGCCTTT 1020 

CTCACTTGAC TGATTGTACT TAATCACTAC AGTAAAGTCA CCGCTAGTAA ATTTATCCTT 1080 

TAACTCTTTA GTAACATTTT CTCCGCCCCC TGTTAAA6TA ACATTATTTT TTTCTAAGAC 1140 

AGGAGTTTCT TCCGCTGTAG AAGATGGATC CTTAACAGTA GTTTCAACTG TTCGAGGTTG 1200 

TACAGTAACT TCCGAAGAGT TATCCGATGT AGGTTGTACT TCCGAAATCG GAGTCGTTGG 1260 

TGCAACAGGT TGCACCAACT TTGGTGTTGA TACTTCAGAA GTTTCAGTCT CCTGAGCTGC 1320 

AACTGAGTTA GCAACAAATG CTGATAATAC CACTACAGTA CCTAAGGTTA CATATTGTTT 1380 

AATATTTTTT TTCATTTTAT TTTTCCTCGT TTAAAACTTT GATAACAAGT TTTTTAACAG 1440 

TTTCATCATT GCAATGAATC TTTGGTTGGT GAAGATCTTC TTCAAAAGTC ACCAACATAT 1500 

TCCCTGGAAG CAATTCAACA ATTTGATA6T CTTTGCTATC GTAAAAAGCA ATATCCTTCT 1560 

CTTCGCTAAA AGGTACACGT GACTGGGCAC GAACTGGGGA AGTTACTGCC ATTTTTTCAG 1620 

TATTTTCAAC AACAATATGA ATATCTAAAT ATTTCTTATG AGTTTCAAAA ATATCTCCTG 1680 

GAACTCCATC AGCTAGATAA GTCATACAAT TTGCAAAAAC ATTTTCCCCG TCAATATCAA 1740 

TTTTTCCATC AACTAAATCT GTCAAATTTG TATTTTCTAA AAAATCACAG ACTTTTGAAA 1800 

AATATTTATT GACAGAAGCA TATCGTTTAA AATCAGATTG TTCAGAAATA ATCATATTAT 1860 

TTTCTCTTTT CTATTAGTGA CGAACTTCCC AACTTGAATC CGCTTTAATT TCTGTAATAT 1920 

CATGAATCGT TGTATATTTA GGTGCAGATA CTTTATTTCC AGTAAGAACA GATACAATAT 1980 

AACCTGAAAC TACTGATACA GAGATTGAAA TCAATGAATA TGCCCAGTAG CTAACAGCTG 2040 

TTGGAGGAAG GAA6TATTTA ATAAATACCA TGACGATGGT TGATACAATC AGCGCTGCAT 2100 

AAGCACXTTTG TTTATTTGCT TTTTTAGAAA CAAATCCAAG AATAAATACA CCACCAAGTA 2160 

GACCAA6TAC AAGTCCCATG AAACTATTGA ACCATTCGTA TGCAGATTTA ATATCTGAGT 2220 

GAGCCATGAC AATGGAAACA CCAATTGAGA ATAAACCTAC TGCTAGAGAT ACGAATTGOXS 2280 

CAATTTTCGT ACGACGATTG TCTGACATAT TTTTAGAAAT GACATCTTGA ATATCCAATG 2340 

TCCATGAAGT TGCAACAGAG TTCAAACCTG TTGAAATAGT TGATTGAGAT GCTGCATAAA 2400 

TCGCTGCCAA GATCAAACCT GTGATACCTA CTGGTAACTG GTATGCAATA AAGTACATAA 2460 

AGATTTGGTC TT6AGG6ATA TTGCTAGCTG CACTATCTGC ATTTTGTACT TGATAGAATA 2520 

CGTACAAGCC TGTACCAATC AAGTAAAAGA CTGTTGCAGT TGCAAGTGAC AAAACACCGT 2580 

TTGTGAACAA CATCTTATTA AGTTTCTTAA TATTTTGTGT TGTAGTAAAA CGTTGAACCA 2640 
AATCTTGAGA TGAAGCATAG GAAGACAAGA TTGTAAAGCC TGAACCCATC ACAATTAAAA 2700

AGATGGAGTT TGAAAGCAAG TTAGGATCGA AAAGTTTTTC ATTTGCAGCA A6GAATTTCC 2760 

CGTTTGCTAA TGTTTCTGCT ACTGCACCAA AGCCACCTTT AATATTAGCA ATCAGTACAA 2820 

ATAAAGCTAA AACGACACCA CTAATCAGAA TCACACCTTG AATAAAGTCT GTCCATAATA 2880 

CGGATTTTAG ACCACCAGTA TAAGAATAAA CAATTGCAAC TACACCCATC AAAATAATCA 2940 

AAATATTGAT GTCAATTCCT GTCAATACTG ATAAACCAGC TGATGGGAGG TACATAATGA 3000 

TAGACATACG TCCCAATTGA TAAATAATAA ACAAGAGTGC TGAAATAATA CGAAGTGCTT 3060 

TAGAATTAAA ACGTTTATCC AAGTAATCAT ATGCCGTATC GATGTCTATC CGTGCAAAGA 3120

TAGGTAAGAT AAAACGAATT GTCAGTGGAA TAGCTACTAC CATCCCTAAT TGAGCAAACC 3180 

ATAAAATCCA GCTACCTGCA TAAGAGCTAC CAGCGAGTCC CAAGAAGGAA ATCGGACTGA 3240 

GCATTGTGGC AAAAATGGAT ACCGAAGTAA CATACCAAGG AACCGAACCA TCTCCTTTAA 3300 

AGAACTCTTT TCCTTTCATC TCTTTTTTAG AGAAATAGAT ACCTGCAACC AACACCGCAA 3360 

GTAAATAAAC AATCAAGATA ATTAAGTCAA TTATTGTAAA TCCTGTTGTG CCCATAACAT 3420 

ATCTCCATAT TGATTTTATT TATTATAAAA ATTCTTTTCG TGCTTGTTGA ATAAGTTCTG 3480 

CTGCTTGTTT TGCAACTTCC AAGTCACCTT CTGCCAATGC TTCTAAAGGT TGACGAACAG 3540 

AACCTAAATC AAGTTTTTCA TTTAGACGCA AAACTTCTTT TGCTACAGCA TACATATTTG 3600 

CCTTACCTGA TATCATCTTA TAGATAACTT CATTGATAGC ATATTGAAGT TTTTTAGCTG 3660 

TATCTAAATC TC6TTCTTGA ATCAAACTTT CCAATTTCAA GAACAAATCT GGCATAACGC 3720 

CATAAGTACC ACCAATACCA GCTTCTGCTC CCATCAAGCG ACCACCAAGA TATTGTTCAT 3780 

CTGGACCATT GAATACAATG TAATCTTCTC CACCTGCAGC TACAAACATT TGAATATCTT 3840 

GTACAGGCAT AGAAGAATTT TTAACTCCAA TCACACGAGG ATTTTGACGC ATT6TTGCAT 3900 

ACAAACTACC AGTCAACGCA ACCCCTGCCA ATTGTGGAAT ATTATAGATA ATAAAATCTG 3960 

TATTTGACGC AGCTTCACTC ATTGCATTCC AATATGCTGC GATTGAATAC TCTGGCAATT 4020 

TGAAATAAAT AGGTGGGATA GCTGCAATAG CATCGACTCC AACACTTTCT GAATGTTTTG 4080 

CCAATTCGAT ACTATCTTTC GTGTTATTAC ATGCAATATG GTTGATAACT GTTAATTTAC 4140 

CTTTAGCAAC TTCCATAACA GCTTCAATAA TTTGTTTACG ATCTTCTACA CTTTGGTAAA 4200 

TACATTCACC TGAAGAACCA TTTACATAGA TACCTTTTAC ACCTTTGTCA ATGAAATATT 4260 

GTACCAGAGA TTTTACACGA TCTTGGCTAA TTTCACCATT TTCATCATAG CAAGCATAAA 4320 

ATGCAGGGAT AACGCCTTTG TATTTAGTTA AATCTTTCAT CAGATTTCTC CTTTATATTG 4380 

TTTTTTATTT GATGACATTA ATAAATCGCT GAGCAATTTC TTTTGGACGT GTAATCGCTC 4440 
CACCAATGAC TACACTGGTA ACACCTAAAC TATAAGCTTT TTTTAATTGT TCTGGATAAT 4500

GAATTTTTCt TCGGCAATTA CCGGAATATT AAAATCAGCC AATTTTTTCA TTAGTTCAAA 4560

ATCAGGCTCA TCTGATTGTA CACTTGTACT TGTGTAACCT GATAATGTTG TACCAACAAA 4620

ATCAACGCCT GATTTAAATG CATAGAGACC TTCATCTAAA TTACTTACAT CCGCCATCAG 4680

CAATTGATTC GGATATTTTT CTTTTATTTT TTTGATAAAT TCACTGACAA CTAAGCCATC 4740

ATATCTTGGT CTTAAAGTTG CATCAAATGC AATGACTGTT GTTCCGCATT CTACAAGTTC 4800

ATCTACTTCT TTCATCGTAG CAGTAATATA TGGTTCTTGA GGTGGATAAT CCCTTTTGAT 4860

AATTCCAATT ATTGGTAAAT CTACTACTTT CTGAATTGCT TTAATATCAC GCACAGAATT 4920

TGCGCGAATG CCCACTGCTC CTGCCTCTAA AGCTGCTTTA GCCATAAAA6 GCATCAAGCT 4980

AAATTCTTCA TTATAAAGGG CTTCACCAGG TAAAGCTTGA CAAGAAACAA TGACTCCACC 5040

TTGAACTTGG CTTATAAATT TTTCTTTAGT CCAAATTTGG CTCATTTTAT TATTCCTCCT 5100

TATGGATAAT AGTTTGATTG TAATAATATT GTCTCTCTGG ACTTTCCAGA TAATTAGAGA 5160

ATAAGCAGTC TGTAATTAAA AGTATTGGAA ACTGAGGTGA TATGCGATTG CCATACGAGA 5220

GATGATCGGT CGAAGCTAAT AACAATAGTT CATCAAAGAA ACAATCTTCT TCGTCAAATT 5280

TTCTTGTAGT CATTAAAACT GTTTTAGCGC CTTTATCTGC AGCTTTTTGT AGACCTTCTA 5340

GTACAATATC AGTTTGACCT GAAATGGATG CTCCAATGAC AAGGCAATTT TCATTAAGTA 5400

GTAAGCTACT CCACAAAATC ATATCCTCGT CTGATAATAC TTCACCAATC ACTCCGAGAC 5460

GCATAAATCT CATCTTCATT TCTTGTAAAG CAAGAACAGA ACTTCCTTTA CCGTAGAGAT 5520

ATACACGCTC AGCAGTTTCT ATCATCTCAG CAATACGCTC AA6TTGAACT TCATCAAGAA 5580

CCGTGTAA6T TTTTCTCAAC ATTTCCTCAT AGTCGGATAA AACTTTTTCT GTTGCCTCTG 5640


TATATAATGC 


CAACTTTTCT 


TTCTCATGAA 


TCATCTCTTG 






5700 


TAAAACCTTT AAAACCACAT TTTTTCGCAA ATCGAGTCAA TGTTGCTTTG GATACATTAA 5760

GGTATTCGCA CAATGCTTTA GATGAATAAT CATTCAGAGG TTGCTGTTTT AAGAAGAATT 5820

TAGCAATGTC TTTTTCAGCA TATGCCATAT TTGGTAAGTT AGCTTCTATC ATTGGAATTA 5880

GTTCTTTTTG CAGTAACATA TGAGCTCCTT AGTTGAAGTA AACGTTTACA TTCTTTATTT 5940

TAACACTTTT TTTTTTTTTC AATATTTTTC ATAAATTAGA AACTAGTTTC CAATTTCTTT 6000

CGTTTCATAA CA6AACAACA AACATAAAAA TATAATAGTT TTTATTCTTT TTATCGTAAT 6060

TATATGTATT GTAAGAACGT TTATCACTAA TAATATGTTC ATATTAAAAT ATTTTAGTAA 6120

TATTTTATTT TGGTTTTATT ATTTCTTTTC GGAATTTCTA TATAATATTT TATTTCTAAA 6180

AAAATTGAAA AAATATTTCT AGTTTCTTTA TTTTATATAG GTAATATATT TTATTTCTAA 6240

ATTAAAAGAG AATCCCATAA AAACTACAGA TTTATGAGAT AAATCAGGTC ACCTATTTTA 6300 

AAAAAGCAGC AAACTATAAA CTAAAAAGTT CCACACCAAA TGTAACCCCA TACTTCCCCA 6360 

TAAGTCAGAT TTATAGCGCA CCATACCTAA AAACATTCCA AGTGAAACGT ACAGACACCA 6420 

AGCTAGAATG GTTCCTGGAT GATGTACTAA GGCAAATAAA ACACTTGTCA AAGCAACTCG 6480 

AATATCTAAT TTTCTAACCA AGTTCCATAA AATTTCACGA TACAGAAATT CTTCAACCAT 6540 

ACTCGCATTG ATTAAGAACA ATAAAAATGA AAACCAAGGA ACTTGATGTT GAAGGCCAAT 6600 

TAAATTTGTT TGATTCGTGC TTCCTTGAGC ATGAATCAGG CTAAAACATA GACTTATAAT 6660 

CAGTAGACTA GCTAGTCCAA TACCAAGGCA TTTCATCCTA GTTTTCATAT TGACCTTGAC 6720 

CACTTGTTTT CGTTGACCAT ACATCCATAA AAAAGAAAAA AGAGACGCAC CATAGAGAAC 6780 

CTGTAGTATA GTTAACTCAC CGATACAAAG AAATTTCAAT AAGTATAGAG ATACCAATAG 6840 

GACATTTACT TGTTGGAATA TATAAACTGG AATTATTCTT TTCATAGTTA CCTCCGAAAT 6900 

AAATCTTCAT AATCTAAATC TAATATCTGC ACAATCCTTT CTACCCATGG ACTTTGAGGC 6960 

ATTCGTTGTT CCATCTTGTA GTGGCGAATC TTTTGATATA AACGATTCAA TTCACTTGGA 7020 

TAGTG7U\ACT CTCCCGCAAA CATTTTTCTG GTTAACTCAA TCCAGCTGAT ATTTCTTTCA 7080 

GCCAAAATAA TGGACAAGTT CTCCCAAAAT CGTTCAGCCA TATTrCTTCT CCTTTAGTTA 7140 

GATAAATAAT GTGTTTGyGC CATGTAAATC AATTGTTTCG TATCTCTTGG CAATAGAGCT 7200 

CTAGCCTCTT CCAAATTCAG ACTTGGATAA ACCCGCTTAT TTGAAACCAC AAAAGGAAGT 7260 

CCGATGGTTA GTTCAGGATT TTTTAAAATT ATCTCAACGA AATCCGTTAA TCTTAGATTG 7320 

TCACGGTPCT TAAATCGTAA TAAATTGGGA GATAAAAACT CAAAACAATC TGAAGAATAG 7380 

CTCATCATCT CAATTAATTT GTCCTTTGTC ATTTCAGAAA CTGAATGACA AGATACCTCA 7440 

ATGCCATAGT TTTGGAAGAA GTCTAAAAGA AGTTGATTTC TTTGGCTATT TTTACTTAGA 7500 

TAGAGATCAA TCATGGGAGA CCTCCAACAA ATTTGCTTCC ATTTGATATT CTGAGACGAT 7560 

TAAGGAATCT AACAACTTTG AGAAGTTAAT CGATTTCTTG TCTTCATCAT AAGCTTTTAC 7620 

AGTTACTTGG GTTGTAAGTA TCCCCTCTTT TCCCTCGGCT CGATAGTCTT GTCAATATAA 7680 

AACAAAAACA AGATTCTGAT TATCATCTAC AAAGGCATTA ACTCCGTTCT TTATATCCTG 7740 

ACTTTCAAGG AATTCCATAA CGTTTTGAAG ATAGGATTCA TAAAATAGTG GGTAATTATG 7800 

TTTTTTATGG TAATCATCTA AAAATGTTAC CTCAAACTCA CAT6GATAAT TGGGCATCAA 7860 

AAATATTTGT TCATCCAGCT GTTTGATTTC TGCATCATGT AATTCTGTTT CTAATTCATC 7920 

ACAATCTAGT ATTGATTCTT TATTTAATGC TTTTATCTTT TTCCTCTATT TCTTTTAATT 7980 
TCTTTGCGAT TGCGGCAATC ACAGGAACGG TTACACTATT ACCAACTTGT TTATAGAGCT 8040

GACTATTAAT AGAGACTTTT CTAGCAGCTT CAAAAGCCTA ATCAGGAAAG CCATGCAATC 8100 

GAAAACACTC TTTAGGAGTG ATTCGTCGTA TTCTCAAACG GTAAAATTGT CCATCTATTA 8160 

AAACACCAGC TACTTGGTAA ACTTGTTTAT CTTCTCCTTC ATAGCTAGCC ACTACTACTC 8220 

CCATTTGACC ACTAGTTGTT AACGTATTAG CTATACCTTT TCCAACTCTA CCACGACGAT 8280 

ACTGAGAACT TGGTCTTTCT AAATTGATTG AATCCCCAAT CTCTGCTTGA GCATATCCTT 8340 

TTTTCGTTGC TTCCCGTACT TTTAGAAATT GGATTGGTTC TGGAATTAGT ATTTTGGGGA 8400 

TTTTATCTCC TCCTTGCATC GTAGTCAGTG TTGGAGATAA GCCCTCACTT CCATAGACAC 8460 

GACCTGTCTC CTTAAAGCTA GTCGGTAAAT CTCCAACAAC GACAATGCCA TAACGATCCT 8520 

6AGTATTTAA AGTAAACATC GGCTCTTGAT TTTCCTTAAA GCGTCTCCCA TTTTGTCTCT 8580 

TGTCTAATCT ATCTGGTGTC ATACAAGGAA TCGCAACTTT AAATCCTTCT CCTTTACCAC 8640 

GAACTAAGGT TGGCGCAAGA CCTTCTGAAT AATAGACTTT ACCGCTCATT CCACTTCTTG 8700 

ATGGATTCAA ATTTCCTAGT GCTTTCAAAG TCTCAGAGTT AGTTGCTTGA CCTTCTCGTC 8760 

TGAAAGGAAA TAAGAGTCTG GTACCTTTCT TTCTAGAATG TCCGATAATA AACACCCTCT 8820 

CTCTGTTTTT GGGAACGCCA AAATCCTTAC TGTTAAGCAC CTGCCACTCA ACATCAAACC 8880 

CCAACTCATC AAGTGTGGTA AGTATTGTGG TGAACGTCCG TCCCTTATCG TGATTGAGTA 8940 

GGCCTTTAAC ATTTTCAAGA AAAAGAAAAC GTGGTTGGAT TTGTTTGGCC GCCCGAGCAA 9000 

TTTCAAAGAA CAAAGTTCCT CTAGTATCTT CAAATCCCAA TCGTCTTCCT GCGATTGAAA 9060 

ATGCTTGACA AGGGAATCCC CCACAGATGA CATCGACTTT CCCTCTAAGT TTTTTAAATT 9120 

CGTCATCTGA AACATCTCGT ATGTCATGAA ATTGTATTTC TCCTTCCGTT TGAAAAATGG 9180 

ACTTATAAGA TTTCCTAGCA AATTTATCAA TCTCACAAAA TCCCAAGCAC TCATGCCCTT 9240 

GAGCTTCCAT TCCCATCCTA AAGCCTCCTA TCCCAGCAAA TAAATCTAAA ACCCAAATCA 9300 

TTCATACCTC TCTCAACTAG ATGTAACTTA CAAAACCCCT GACCTCATGA GCCACTTTCT 9360 

TCCTCCTCAT GAGGTCAGTT TTACTTTCTG CTGTTCCAGT ATCGTTTTTC CTCGCTAGAT 9420 

TTCCTCAAAA GGGCAGACTC CTCCCTTGGT TCGTCACACX5 ATTTTTTCAT CTCGACTGTT 9480 

CTTTAATGCA TCATTAACGA CGCTTTTCTT CTAGGTGGTT CATAAGGAAC AGGAAGATTC 9540 

AGGTTGACTT TTCTAATCCT AGAATAAAGT GCTGAAAACA ATTCGGAATA GOCATAGAGA 9600 

CTAGACAATT TGAGGAGCTG CTTGCGTCCT GTTCGAACAC ATTTTCCTAC CACGTGAAGA 9660 

AAAAGATGGC GGAAGCGTTT GATTGTTAAA GTTTGGAAGT CACCTCCAGC TAGATGTTTG 9720 
AGAAAAAGAT AGAGATTGTA GGCGATACAG CTCATCATCA TACGAACTCG TTTTTGATTA 9780
AGGTTGAACT ATCCGTTTTA TCGCCAAAAA ATCCCTCCTT CATCTCCTTG ATGAAATTCT 9840 
CGGCTTGACC ACGTCCACGA TAAAGCTGAA ACTGGTCTTG GCTTGTTCCG GTACCGA 9897 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8148 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

CCGTGGAACA AGCCAAGACC AGTTTCAGCT TTATCGTGGA CGTGGTCAAG CCGAGAATTT 60 

CATCAAGGAG ATGAAGGAGG GATTTTTTGG CGATAAAACG GATAGTTCAA CCTTAATCAA 120 

AAACGAAGTT CGTATGATGA TGAGCTGTAT CGCCTACAAT CTCTATCTTT TTCTCAAACA 180 

TCTAGCTGGA GGTGACTTCC AAACTTTAAC AATCAAACGC TTCCGCCATC TTTTTCTTCA 240 

CGTGGTAGGA AAATGTGTTC GAACAGGACG CAAGCAGCTC CTCAAATTGT CTAGTCTCTA 300 

TGCCTATTCC GAATTGTTTT CAGCACTTTA TTCTAGGATT AGAAAAGTCA ACCTGAATCT 360 

TCCTGTTCCT TATGAACCAC CTA6AAGAAA AGCGTCGTTA ATGATGCATT AAAGAACAGT 420 

CGAGATGAAA AAATC6TGT6 ACGAACCAAG GGAGGAGTCT GCCCTTTTGA GGAAATCTAG 480 

CGAGGAAAAA CGATACTGGA ACAGCA6AAA GTAAAACTGA CCTCATGAGG AG6AAGAAAG 540 

TGGCTCATGA GGTCAGGGGT TTTGTAAGTT ACATCTAGTT GAGAGAGGTA TGAATGATTT 600 

GGGTAAATAC AATGAGCTTG AAAGAAGTAG CAAACTCACC AAGCGCCAAT TCTTTGAGAA 660 

TCAGATGCTG GATTATACCA TCATTGCGCA TGAGAGTTTT GAAATCATCC GTCATTCTGT 720 

CTACCAGACA GATGATCGTG AAGTGGAAAA TGCTCTGGCT TTTGAAGTGA AAAATGATGA 780 

AACAGACAAG CTGATTCTGT TATTAAGCGA GGATATTGGT GTAGGTGAAA AATTGTGCCT 840 

CGTTGACGGA ACAAAAATGC GTGGAAAATG TTTAGTATAT GATAAAATAA ATGAGAGAAT 900 

GATTCGCTTG CAGTGCTAGA AATAGGCATT TTGAATAGTG AATATGTTAT AATAAGTATT 960 

AGTAGGAGGT GTTTTAGATT GGAGAAGAAA CTGACCATAA AAGACATTGC G6AAATGGCT 1020 

CAGACCTCGA AAACAACCGT GTCATTTTAC CTAAACGGGA AATATGAAAA AATGTCCCAA 1080 

GAGACACGTG AAAAGATTGA AAAAGTTATT CAT6AAACAA ATTACAAACC GAGCATTGTT 1140 

GCGCGTAGCT TAAACTCCAA ACGAACAAAA TTAATCGGTG TTTTGATTGG TGATATTACC 1200 

AACAGTTTCT CAAACCAAAT TGTTAAGGGA ATTGAGGATA TCGCCAGCCA GAATGGCTAC 1260 
CAGGTAATGA TA6GAAATAG TAATTACAGC CAAGAGAGTG AGGACCGGTA TATTGAAAGC 1320

ATGCTTCTCT TGGGAGTAGA CGGCTTTATT ATTCAGCCGA CCTCTAATTT CCGAAAATAT 1380

TCTCGTATCA TCGATGAGAA AAAGAAGAAA ATGGTCTTTT TTGATAGTCA GCTCTATGAA 1440

CACCGGACTA GCTGGGTTAA AACCAATAAC TATGATGCCG TTTATGACAT GACCCAGTCC 1500

TGTATCGAAA AAGGTTATGA ACATTTTCTC TTGATTACAG CGGATACGAG TCGTTTGAGT 1560

ACTCGGATTG AGCGGGCAAG TGGTTTTGTG GATGCTTTAA CAGATGCTAA TATGCGTCAC 1620

GCCAGTCTAA CCATTGAAGA TAAGCATACG AATTTGGAAC AAATTAAGGA ATTTTTACAA 1680

AAAGAAATCG ATCCCGATGA AAAAACTCTG GTATTTATCC CTAACTGTTG GGCCCTACCT 1740

CTAGTCTTTA CCGTTATCAA AGAGTTGAAT TATAACTTGC CACAAGTTGG GTTGATTGGT 1800

TTTGACAATA CGGAGTGGAC TTGCTTTTCT TCTCCAAGTG TTTCGACGCT GGTTCAGCCC 1860

TCCTTTGAGG AAGGACAACA GGCTACAAAG ATTTTGATT6 ACCAGATT6A AGGTCGCAAT 1920

CAAGAAGAAA GGCAACAAGT CrTGGATTGT AGTGTGAATT 6GAAAGAGTC GACTTTCTAA 1980

AATGAAGGAA AATGACTTGC AATCTCTGTT AAGAAATAAA ATAATCXrCAC CTAGAACAAG 2040

CTAGGTGGGA TTATTTGCCT ATGAAATGAG AAATTATGGG AGCAAGCTCC TAAATCAACT 2100

GTTTTTGATC TACTTCTTTA ACTACTTGAT AAAAGTTATA GAAGTAGGCC AAACTTGAAA 2160

TGATGGTTAC GACTAGGAAT ATTGAAAATT TCCATTGGAC AGGGTTGGTT AAAAGTTGTG 2220

GAAAGGATAT GAG6AGAAAG AAGAGGGCTG CGTTGAGGAC AGGTATCCGT TTTGATTGTA 2280

TTTTCTCAAG TCCTTTATTG AGCGCAGGAA GAAAGAGGAG TAGGAGTAGT AAAACTGTAT 2340

GAGAAATAGC TCCTGAAGTA AGGGCGAAGA AAAGGAAAAT ACTGATAAAA ACATGAATGA 2400

TCAGTAGTCT AGCTAGTGAT TTCATAAGGC ACCTCCTAAT CCTGGTCTTT TTTAGCTCTT 2460

GCAATACGAA GTGAGTCGAC AATATGTATC ATCACTCC6A AAAAGAAAGC TCCCAGTATA 2520

GTTTTAAAAA TATGTTTTGT ATTTAGAAGA GAACTGATAA AATTTGGATT TTCACTTGTT 2580

AGGGTATCAA TGAGTGGAAT TATAAAAAAT ATCACTGTTC CATAAATCGA ACCTGCTTTC 2640

AGACCAGGAT AACGTAACTG TTTCTTTTCT TTTTTCATGA GTTTCCTCCT AATCCTCATC 2700

TTGATTTTTC TTAGTTTTTG CAATGCGACG GGAGATGAGG AACTGTATGC TCGCTCCGAA 2760

GAAAATAGAA CCGAGAATAC TTGATACACC ATTTCTTATA GTGA6AAGAG AATGAAAATA 2820

GTCCTGACCT TCATCTATGA GTATCCTGA6 AAGAGGAGTT ATAAAAAACA TCCATAGACC 2880

AAAGAAGAAA CCTGCTTTCA GACCTOGGTA GTGTAGTTGC TTGCTTTCTT TCTCATTCAG 2940

CATATCTGGT TCAATGACTG TGATGCCTGT TTTTTTCATT TGGTAGGTGA CATAGCCAGA 3000

AGCGATGAGG GCAATCACTA AAATCAGAGG AGGATAGATT AGAGCCACTT CTTGAGGGTA 3060

TTTATAGGCC AGAAGGAGTG GAATAAGATT TCCGAAAATC ATCAGATAAA AGAGGATGAT 3120 

AAAGACTTGG TTCCCAATAC TATCGGCCTC ACGCCGTTTG TATTCGTCAA GGGGACCAGA 3180 

AATACCGTAT GTGCGTTTGA TCAGTTTTTC AGTGAAGGTT TCTTTTTTCA TGAGTTTGCT 3240 

CCTTTTTTAA AAATCTTCCT CCCAAAAGAG ACTGTTGAGG TCAGTTTGGA GGCTGCGGGC 3300 

GAGATTGAGA CAGAGTTCCA AGGTTGGATT GTACTTGTCG TTTTCAATCA TATTGATAGT 3360 

CTGTCTCGAG ACACCGATAT CCTTGGCGAG TTCGAGCTGG GAAATACCCA ATTCCTTGCG 3420 

AAATTCTTTC ACACGATTCA TCTGTTCTCC TTTCTGATTT ATGTCGTATA TATTTGACTA 3480 

TATTATAGTC TTTTAAACAT AAAGTGTCAA GTATTTTTGA CATATTTTTT GAAGAAATAG 3540 

TA6TCTCCTT GTCCTATTTG TCTGACAAGT GCAAGCTGGT CGGATTTGTG GTAAAATAGA 3600 

TAAGATATGA CAAAAGAATT TCATCATGTA ACGGTCTTAC TCCACGAAAC GATTGATATG 3660 

CTTGACGTAA AGCCTGATGG TATCTACGTT GAT6CGACTT TGGGCGGAGC AGGACATAGC 3720 

GAGTATTTAT TAAGTAAATT AAGTGAAAAA GGCCATCTCT ATGCCTTTGA CCAGGATCAG 3780 

AATGCCATTG ACAATGCGCA AAAACGCTTG GCACCTTACA TTGAGAAGGG AATGGTGACC 3840 

TTTATCAAGG ACAACTTCCG TCATTTACAG GCATGTTTGC GCGAAGCTGG TGTTCAGGAA 3900 

ATTGATGGAA TTTGTTATGA CTTGGGAGTG TCTAGTCCTC AATTAGACCA GCGTGAGCGT 3960 

GGTTrrrCTT ATAAAAAGGA TGCGCCACTG GACATGCGGA TGAATCAGGA TGCTAGCCTG 4020 

ACAGCCTATG AAGTGGTGAA CAATTATGAC TATCATGACT TGGTTCGTAT TTTCTTCAAG 4080 

TATGGAGAGG ACAAATTCTC TAAACAGATT GCGCGTAAGA TTGAGCAAGC GCGT6AA6T6 4140 

AAGCCGATTG AGACAACGAC TGAGTTAGCA GAGATTATCA AGTTGGTCAA ACCTGCCAAG 4200 

GAACTCAA6A AGAAGGGGCA TCCTGCTAA6 CAGATTTTCC AGGCTATTCG AATTGAAGTC 4260 

AATGATGAAC TGGGAGCGGC AGATGAGTCC ATCCAGCAGG CTATGGATAT GTTGGCTCTG 4320 

GATGGTAGAA TTTCAGTGAT TACCTTTCAT TCCTTAGAAG ACCGCTTGAC CAAGCAATIXS 4380 

TTCAAGGAAG CTTCAACAGT TGAAGTTCCA AAAGGCTTGC CTTTCATCCC AGATGATCTC 4440 

AAGCCCAAGA TGGAATTGGT GTCCCGTAAG CCAATCTTGC CAAGTGCGGA AGAGTTAGAA 4500 

GCCAATAACC GCTCGCACTC AGCCAAGTTG CGCGTGGTCA GAAAAATTCA CAAGTAAGAG 4560 

GGAAAAAGAT GGCAGAAAAA ATGGAAAAAA CAG6TCAAAT ACTACAGATG CAACTTAAAC 4620 

GGTTTTCGCG TGTGGAAAAA 6CTTTTTACT TTTCCATTGC TGTAACCACT CTTATT6TAG 4680 

CCATTAGTAT TATTTTTATG CAGACCAAGC TCTTGCAAGT GCAGAATGAT TTGACAAAAA 4740 

TCAATGCGCA GATAGAGGAA AAGAAGACCG AATTGGACGA TGCCAAGCAA GAGGTCAATG 4800 
AACTATTACG TGCAGAACX5T TTGAAAGAAA TTGCCAATTC ACACGATTTG CAATTAAACA 4860

ATGAAAATAT TAGAATAGCG GAGTAAGATA TGAAGTGGAC AAAAAGAGTA ATCCGTTATG 4920

CGACCAAAAA TCGGAAATCG CCGGCTGAAA ACAGACGCAG AGTTGGAAAA AGTCTGAGTT 4980

TATTATCTGT CTTTGTTTTT GCCATTTTTT TAGTCAATTT TGCGGTCATT ATTGGGACAG 5040

GCACTCGCTT TGGAACAGAT TTAGC6AAGG AAGCTAAGAA GGTTCATCAA ACCACCC6TA 5100

CAGTTCCTGC CAAACGTGGG ACTATTTATG ACCGAAATGG AGTCCCGATT GCTGAGGATG 5160

CAACCTCCTA TAATGTCTAT GCGGTCATTG ATGAGAACTA TAAGTCAGCA ACGGGTAAGA 5220

TTCTTTACGT AGAAAAAACA CAATTTAACA AGGTTGCAGA GGTCTTTCAT AAGTATCTGG 5280

ACATGGAA6A ATCCTAT6TA AGAGA6CAAC TCTCGCAACC TAATCTCAAG CAAGTTTCCT 5340

TTGGAGCAAA GGGAAATGG6 ATTACCTATG CCAATATGAT GTCTATCAAA AAAGAATTGG 5400

AAGCTGCAGA GGTCAAGGGG ATTGATTTTA CAACCAGTCC CAATCGTAGT TACCCAAAC6 5460

GACAATTTGC TTCTAGTTTT ATCGGTCTAG CTCAGCTCCA TGAAAATGAA GATGGAAGCA 5520

AGAGCTTGCT GGGAACCTCT GGAATGGAGA GTTCCTTGAA CAGTATTCTT GCAGGGACA6 5580

ACGGCATTAT TACCTATGAA AA6GATCGTC TGGGTAATAT TGTACCCGGA ACAGAACAAG 5640

TTTCCCAACG AACGATGGAC GGTAAGGATG TTTATACAAC CATTTCCAGC CCCCTCCAGT 5700

CCTTTATGGA AACCCAGATG GATGCTTTTC AAGAGAAGGT AAAAGGAAAG TACATGACAG 5760

CGACTTTGGT CAGTGCTAAA ACAGGGGAAA TTCTGGCAAC AACGCAACGA CCGACCTTTG 5820

ATGCAGATAC AAAAGAAGGC ATTACAGAGG ACTTTGTTTG GCXSTGATATC CTTTACCAAA 5880

GTAACTATGA GCCAGGTTCC ACTATGAAAG TGATGATGTT GGCTGCTGCT ATTGATAATA 5940

ATACCTTTCC AGGAGGAGAA GTCTTTAATA GTAGTGAGTT AAAAATTGCA GATGCCACGA 6000

TTCGAGATTG GGACGTTAAT GAAGGATT6A CTGGTGGCA6 AACGATGACT TTTTCTCAAG 6060

GTTTTGCACA CTCAAGTAAC GTTGGGATGA CCCTCCTTGA GCAAAAGATG GGAGATGCTA 6120

CCTGGCTT6A TTATCTTAAT CGTTTTAAAT TTGGAGTTCC GACCCGTTTC GGTTT6ACGG 6180

ATGAGTATGC TGGTCAGCTT CCTGCGGATA ATATTGTCAA CATTGCGCAA AGCTCATTTG 6240

GACAAGGGAT TTCAGTGACC CAGACGCAAA TGATTCGTGC CTTTACAGCT ATTGCTAATG 6300

ACGGTGTCAT GCTGGAGCCT AAATTTATTA GTGCCATTTA TGATCCAAAT GATCAAACTG 6360

CTCGGAAATC TCAAAAAGAA ATTGTGGGAA ATCCTGTTTC TAAAGATGCA GCTAGTCTAA 6420

CTCGGACTAA CATGGTTTTG GTAGGGACGG ATCCGGTTTA TGGAACCATG TATAACCACA 6480

GCACAGGCAA GCCAACTGTA ACTGTTCCTG GGCAAAATGT AGCCCTCAAG TCTGGTACGG 6540

CTCAGATTOC TGACGAGAAA AATGGTGGTT ATCTAGTCGG GTTAACCGAC TATATTTTCT 6600

CGGCTGTATC GATGAGTCCG GCTGAAAATC CTGATTTTAT CTTGTATGTG ACGGTCCAAC 6660 

AACCTGAACA TTATTCAGGT ATTCAGTTGG GAGAATTTGC CAATCCTATC TTGGAGCGGG 6720 

CTTCAGCTAT GAAAGACTCT CTCAATCTTC AAACAACAGC TAAGGCTTTA GAGCAAGTAA 6780 

GTCAACAAAG TCCTTATCCT ATGCCTAGTG TCAAGGATAT TTCACCTGGT GATTTAGCAG 6840

AAGAATTGCG TCGCAATCTT GTACAACCCA TCGTTGTGGG AACAGGAACG AAGATTAAAA 6900 

ACAGTTCTGC TGAAGAAGGG AAGAATCTTG CCCTGAACCA GCAAGTCCTT ATCTTATCTG 6960 

ATAAAGCAGA GGAGGTTCCA GATATGTATG GTTGGACAAA GGAGACTGCT GAGACCCTTG 7020 

CTAAGTGGCT CAATATAGAA CTTGAATTTC AAGGTTCGGG CTCTACTGTG CAGAAGCAAG 7080 

ATGTTCGTGC TAACACAGCT ATCAAGGACA TTAAAAAAAT TACATTAACT TTAGGAGACT 7140 

AATATGTTTA TTTCCATCAG TGCTGGAATT GTGACATPTT TACTAACTTT AGTAGAAATT 7200 

CCGGCCTTTA TCCAATTTTA TAGAAAGGC6 CAAATTACAG GCCAGCAGAT GCATGAGGAT 7260 

GTCAAACAGC ATCAGGCAAA AGCTGGGACT CCTACAATGG GAGGTTTGGT TTTCTTGATT 7320 

ACTTCTGTTT TGGTTGCTTT CTTTTTCGCC CTATTTAGTA GCCAATTCAG CAATAATGTG 7380 

GGAATGATTT TGTTCATCTT GGTCTTGTAT GGCTTGGTCG GATTTTTAGA TGACTTTCTC 7440 

AAGGTCTTTC GTAAAATCAA TGAGGGGCTT AATCCTAAGC AAAAATTAGC TCTTCAGCTT 7500 

CTAGGTGGAG TTATCTTCTA TCTTTTCTAT GAGCGOGGTG GCGATATCCT GTCTGTCTTT 7560 

GGTTATCCAG TTCATTTGGG ATTTTTCTAT ATTTTCTTCG CTCTTTTCTG GCTAGTCGGT 7620 

TTTTCAAACG CAGTAAACTT GACA6ACGGT GTTGACGGTT TAGCTAGTAT TTCCGTTGTG 7680 

ATTAGTTTGT CTGCCTATGG AGTTATTGCC TATGTGCAAG GTCAGATGGA TATTCTTCTA 7740 

GTGATTCTTG CCATGATTGG TGGTTTGCTC QGTTTCTTCA TCTTTAACCA TAAGCCTGCC 7800 

AAGGTCTTTA TGGGTGATGT GGGAAGTTTG GCCCTAGGTG GGATGCTGGC AGCTATCTCT 7860 

ATGGCTCTCC ACCAAGAATG GACTCTCTTG ATTATCGGAA TTGTGTATGT TTTTGAAACA 7920 

ACTTCTGTTA TGATGCAAGT CAGTTATTTC AAACTGACAG GTGGTAAACG TATTTTCCGT 7980 

ATGACGCCTG TACATCACCA TTTTGAGCTT GGGGGATTGT CTGGTAAAGG AAATCCTTGG 8040 

AGCGAGTGGA AGGTTGACTT CTTCTTTTGG GGAGTG6GAC TTCTAGCAAG TCTCCTGACC 8100 

CTAGCAATTT TATATTTGAT GTAAGAATGG CACCCTGATG TTTCAGGG 8148 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9909 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



TACTCCACCC TTAATATCCG TTCCTGTAAA TACTTTACCG CTTTTAAGTT CATAGAATTG 60

AACTTTTAAA TGCTTGTCTT CAAGCATCTT TTCCATCCAA TTTTTAGGAG TTTGACCAGC 120

TTTAAATAAA AACCTTGCTG GGGTGATTAG TATAGATTTA TCTGCGATTT TATAAGCTTC 180

ATCAATAAAA TAGTGATATA TCGGCTCATC TCTGGCTTCT CCTGTTTCCT GATACGGAGG 240

ATTTCCTATC ACGACATCAA ATTTCATTTC ACTTTCCTCG CTAGATAGGC GCTCAAAACC 300

TATCATTCTA TTCTTTTTCC AGTCTTTGAT ATGGGTTTTA GATTCTTCTA CTTCTTGGAC 360

TTCTAGCTCA TCCGCAAACA AACTCAATTG TTGAGATTGC TTTTGTTTAG CTGAATAAGG 420

ACTACTTTTT TTCAATCCAT CCATCTGAAA GACATTGTAA GA6ATAATAG TCGCAATTTC 480

TTTCTTTTGC TCTAATGTTG GTTGATTTCC AGTCTTAGCT AGATAATAGT CCTCAAAAGT 540

TGCCAAAAGA TTCTCACGCG CCAAAAGGAG AGAATCTCCT TGATACTCAT AACCATACGA 600

AGCATGATAA GCATCTTTTA CAAGTTTATA AAATGTGACT TCATCTGAAA CCTCACGACT 660

AATCCGTTGC AGTTTTCTAT CAACAAAACC AACTCGCTCA GATAATGGAA TTTCCTCACC 720

AGTTACGGTA TCATATCTCG TTACCATATA AGGTGCTTCA CCACAAGTTA CCTCTAACCA 780

TCGTAAGTCC ACATACTCCT CAAGACTTAA CGAGCCTAAT TTCGATTCTA CATATCCATT 840

TTGCTTTGCG ACCAACCACG TTGGTGTAAA CACTTCTGCC CTTATTTTTG TCCGATCTTT 900

TTGTTCATAT TTGGATTTTT CAGATCTGGG CTGAATCAAG TTGGCAAAGT TTCCAGTAAC 960

CTTACTTGGA TTGATGCGAT CACTTGGAGC AAATCCCTTT CCTAACAATT CATAAGAATG 1020

CGTAnGCCAA ACAATTGATT TCTTTGTCGT TCGATCTTTT AAAAGAATTT TTAATAAGTC 1080

AGCCGATTCT TTA6CCAAAC TTTCTTCACT AATATCTATT GTCATCAGCA ACCTCTCTTA 1140

TATTGTAAGC CCTATTATAT CATATTTTAA AGAATGAAAA TTTACTTGAA AAAAGTAATT 1200

CAATAAATAT CTCTCCGATG ACCAACTTCT AGAGTAGCAA CGACTAATTC ATCATCTACA 1260

ATTTGTACGA TAACTCGATA ATTACCAATT CTATAGCGCC ATTGACCAAC GCGATTACCA 1320


ACCAAAGCCT 




TCTTGGGTCT 


TCCAAAACAT 


TGGTTTGTAA ATAGTTTGTA 


1380 


ATTAGCTTCT GCGTATAACG GTCCAATTTT TTCAATTGCT TGATAAAACG TCTTGTTGGA 1440

ACTAATTTAT ACAAATTATT CATCCTTCAA GCCTAAATCA TGCATCATTT CTTCCCAAGT 1500

AATGGGTTCA ACTCCTTTTT CCAAGTCTTC TAAATACTCT TGATAGGCTA AATCTGCCAC 1560

ACGAGCATCG TATTCATCTT CTAGGGCTTC AAGAGTTTTG G*rGCGAATAA GTTCCGAAAG 1620

GGAAACTCCT TCAAACTTAG CCATTGCTTT CATAAATGTT TTATCAGCTT CAGAAACTTT 1680 

TAATGTAATA GTAGTCATCT TTTGTGCTCC CTTTTTTAAT GGTAACACCA TTGTATTACT 1740 

TTTTAGGTGT TCAGTCAATA TAAAAAGAAC ACCTTCTCAG CGTTCTTTCT ATATCTCTGT 1800 

CAATGGTGTT GCGGTATCTG GTGAGGTATC ATAAACCTTA AAGTCTACTC CGACTCCCAG 1860 

ATCAGCTTGA GCCAGCTGAT TGACCATGGT CATATGAGCC AGTTCCTTGA TATTGTTTTC 1920 

CTTAGATAAA TGCCCAAGGT AAATCTTCTT AGTACGATTT CCTAGCGTCC GAATCATAGC 1980 

TTCAGCACCG TCCTCGTTAG AAAGGTGACC AAGGTCAGAT AGGATTCGTT GTTTGAGTCG 2040 

CCAAGCGTAA GAACCTGATC GCAAAATCTC TACATCATGG TTGGCCTCGA TAA6ATAACC 2100 

ATCCGCATTT TCGACAATGC CCGCCATACG GTCACTGACA TAACCTGTAT CTGTCAAGAG 2160 

GACAAAACTC TTATCATCCT TCATAAAGCG ATAGAACTGC GGTGCGACTG CATCATGGCT 2220 

TACACCAAAA CTCTCGATGT CGATATCTCC AAAGGTTTTG GTTTTACCCA TTTCAAAAAT 2280 

ATGCTTTTGC GAAGAATCCA CCTTGCCAAG ATATTTACTA TTTTCCATAG CTTGCCAGGT 2340 

CTTTTCATTG GCATAAAGAT CCATACCATA CTTGCGAGCC AAAACGCCTA CTCCATGGAT 2400 

ATGATCTGAA TGCTCATGGG TAATCAAGAT GGCATCCAGG TCTTCTGGCT TACGGTTAAT 2460 

TTCAGCTAGC AGACTGGTAA TTTTCTTGCC AGACAAGCCT GCATCTACTA AAAGCTTCTT 2520 

TTTTGAGGTT TCCAGATAAA AAGAATTTCC ACTGGAACCC GACGCTAAAA TACTGTATTT 2580 

AAAGCCTATT TCACTCATTC TAGTCTTCTA CTTCATCXTTC CCATACTTCT TCTTTCACTG 2640 

CATCCTTATC ATAAGGGAGT ACAATGGTAA AGGTTGAACC CTTGCCGTAT TCACTCTTGG 2700 

CCCAAATAAA GCCCTTATGT TGTTTGATAA TTTCTTTAGC GATAGACAGT CCTAGACCTG 2760 

TACCACCTTG TGCACGACTT CTAGCACGAT CCACACGATA GAAACGGTCA AAGATACGTG 2820 

GTAAATCCTG CTTAGGAATC CCCAAACCGT GGTCAGAAAT GGATAAAATC ATCTGGTCTT 2880 

CAGTTGTCTT CATTCTGACA GTGATTTTAC CCCCATCTGG CGAATACTTA ATAGCATTAT 2940 

TTAAAATATT GTCGACAACC TGCGTCATCT TATCTGTATC AATTTCCATC CAGATAGAAT 3000 

T6ATGG6ATA ATCTCTCACC AACTCATATT TTTTCTCCTT TTCCTGTCCT TTCATCTTGT 3060 

CAAAACGATT GAGGATAAAG GTAATAAAAG CAGTGAAGTT AATCAGTTCC ACATCTAGGT 3120 

GACTGGTAGC ATTATCAATA CGTGAAAGAT GGAGGAGATC CXSTCACCATG CGCATCATAC 3180 

GGTTGGTCTC ATCAAGAGAA ACCTTGATAA AGTCTGGTGC TACAGTTTCA CACAAAGCCC 3240 

CCTCATCCAA GGCTTCAAGA TAGGATTTTA CGCTAGTCAG AGGAGTCCGT AACTCATGGC 3300 

TAACATTGGA AACAAAGAGT CTTCGTTCGC GTTCTTCCTT CTCCTGCTCC GTCGTATCAT 3360 
GCAAAACAGC CACCAAACCT GAAATAAAGC CAGACTCTCG ACGTATCAAG GCAAAGCGAA 3420

CTCGAAGGTT CAAATATTCG CCATTGATAT CTTGGGAATC TAGCAACAAT TCTGGACTTT 3480 

GGGTAATCAA ATCACGCAAT TCATAGTTTT CTTCTATCTT GAGCAATTCC AAAATGCTTC 3540 

TATTCAGAAC ATCTTCCTTA ACCAACCCCA GTTGCTTCTT GGCTGTATCG TTAATCATGA 3600 

TAATCTGACC CCGACGGTTA GTCGCAAGAA CCCCATCTGT CATATAAAAC AGAATACTAT 3660 

TTAGCCTCTT ACTCTCTTGT TCTAGATTTT CCTGAGTGAG ACGAATAACC TCCGACAAGT 3720 

CATTCAAATT ATTGGTAATA TTGGTGATTT CAGACCCACC TTGCATATCA AGAACCTTGG 3780 

AATAATCTCC TGCAATCAAA TCTTTAACCT TTTGATTGAC TTGCTTCAAC TGAATATTAT 3840 

CACGTCTATT TTCCAGTAAT AAGAGGGTCA CAACAAGGAT GAAACCTAAC AAAATCAGGA 3900 

TAAAGATAAA ATCTCTGGTA AAAATGGTTT GTTTCAGTAA ATCAAGCATT ATTTCTCATG 3960 

TAATACCCTA CACCACGGCG CGTCAAGATA TACTCTGGTC GGCTGGGCGT ATCTTCAATC 4020 

TTCTCACGCA GACGTCGTAC AGTCACATCA ACTGTACGGA CATCACCAAA ATAGTCATAA 4080 

CCXCAGACAG TCTCAAGCAA GTGTTCGCGC GTGATGACTT GACCTGTATG CGATGCTAAA 4140 

TGATACAAAA GCTCAAATTC ACGATGGGTT AAGTCTAGTT CTTCGCCATA TTTTTTAGCC 4200 

ACGTAGGCGT CTGGAACAAT TTCTAAATCC CCAATTTGGA TAGGTTGAGG TTTACTATCT 4260 

GCTTCCTGAC CATCTACTGG CATAGGTTGA GAACGACGCA GAAGAGCTTT AACACGCGCC 4320 

TGCAACTCAC GATTGGAGAA GGGTTTTGTT ACATAGTCAT CTGCCCCAAG TTCCAAACCG 4380 

ATAACCTTAT CAAATTCACT ATCTTTGGCT GAAAGCATAA GAATGGGCAC ACTGCTTGTC 4440 

TTACGAATGG TCTTAGCAAC TTCTAAACCA TCAATTTCTG GAAGCATCAA ATCCAGAATA 4500 

ATAATATCTG GTTGCTCTGC TTCAAATTGC TCTAGCGCTT CACGACCATT AAAAGCAGTT 4560 

ACAACTTCGT AACCTTCCTT GGTCATATTA AACTTGATAA TATCCGAGAT TGGTTTCTCA 4620 

TCATCTACAA TTAGTATTTT TTTCATATGT TCACCTTTTT CTCTACTATT ATACCAAAAA 4680 

AATAGTCAGA AGACACAATA GCTAGTCTTG GCTACTGTCT AAGTT6GCTT GTGCATAAAC 4740 

CTGCCAGATT TTTTGTTGGG GTTTGGCAAG TGGGTAATTC TTGAATTCTT CTGGTGAAAG 4800 

CCAGCGAACT TCCCTATCTG TUiAAATCATG GAAGTCACTC ACCTGACCTG CTACAATCTG 4860 

TACATGCCAT TTTCGATGAC TAAAAACATG CTGGACTGTA TCAAAACAAA CATCAAGCCA 4920 

ATCAACATCT AGGTCATAGT CCTGCTGGAA ACTCTCTTCT GGACTGGGAC CAAAGTTCAC 4980 

ACTTTCTTCC GCAACCTGAT 6AAAGAGGTC AAACTGCTCT TCTTGCGAAA AGTTATCAAC 5040 

TTCTATAAAG 6GGAAATGCC AAAAACCTGC CAAGAGCTTT TCGCTTTCAT TTTTTTCAAG 5100 
TAAAAATTGT CCTTGAGAAT TTTTCACAAC TAAGGCTTTA AGATAAATAG GAACCGGCTT 5160

TTTCTTAGGA GATTTAATTG GATAACGGTC CATGGTTCCA TTCTGATATG CCGCACTAAA 5220 

GTCCTTGACT GGGCTTTCTT CAGGTCTGGG ATTTACAGGA GACTCAATAT CAGACCCTAA 5280 

GTCCATCAAG GCTTGATTAA AATCACCCGG ACGATCCGGA TTAATCAAGA TCTCCATCAT 5340 

TGCCTGAAAA ATTTTTCGAT TACTTGGAAT CCCAATATCG TGGTTGACTT CAAACAGACG 5400 

CGCCAAGACC CGCATGACAT TACCATCTAC AGCTGGCTCA GGCAAGTTAA AAGCAATACT 5460

GGAAATGGCT CCTGCTGTGT AAGGTCCAAT CCCTTTCAAG CTGGAAATTC CTTCATAGGT 5520 

ATTTGGAAAT TGGCCACCAA AGTCAGTCAT AATCTGCTGG GCTGCAGCCT GCATATTGCG 5580 

AACTCGAGAA TAATAGCCCA AGCCCTCCCA AGCTTTCAGT AAACTCTCCT CAGGCGCAGT 5640 

TGCCAGACTT TCGACAGTTG GAAACCAGTC CAAAAATCTT TCGTAGTAAG GGATAACTGT 5700 

ATCCACCCTG GTCTGCTGAA GCATGATTTC AGATACCCAG ATGTGATAAG GATTTTTACT 5760 

TCTCCTCCAA OGCAAATCTC TTTTGTTTTC ATCATACCAA GCGAGAAGTT TCTCACGGAA 5820 

AGAAATGACT TTCTCCTCCG GCCACAIXSAC GATACCGTAT TCTTTCAAAT CTAACATATC 5880 

TCTAGTATAA CACAGAAGGT TTCACCTGTC TTTGTATCTG ATTTATAATA TTTTCAATAG 5940 

ATAGTATATA ACTTTTCTAT CTACTTATAC TCAATGAAAA TCAAAGAGCA AACTAGGAAG 6000 

CTAGCCGCAG GTTGCTCAAA ACACTGTTTT GAGGTTGTGG ATAGAACTGA CAGAGTCAGT 6060

ATCATATACT ACGGCAAGGT GAAGCTGACG TAGTTTGAAG AGATTTTCGA AGAGTATAAA 6120

TCTTATTGAT GAACTGCTTG CAGTCTGAGA AAAAATGAGC TTGGATATTA TTTCCAAACT 6180 

CACTTAAAGT CAATTTCAAT CCACTAGAAC AAGCCTAGTA CAGTTCCATC GCTTTCAACA 6240 

TCCATGTTGA GAGCTGCTGG ACGTTTTGGA AGACCTGGCA TGGTCATAAC ATCACCAGTT 6300 

AAGGCAACGA TGAAGCCTGC ACCTAATTTT GGTACCAATT CACGAATGGT AATTTCAAAG 6360 

TTTTCTGGTG CTCCAAGCGC ATTTGGATTG TCTGAGAAAC TGTATTGAGT TTTAGCCATA 6420 

CAGATTGGCA ATTTGTCCCA ACCGTTTTGA ACGATTTGAG CAATTTGTGT TTGAGCTTTC 6480

TTCTCAAAGT TCACTTTGCT ACCACGATAG ATTTCAGTGA CAATTTTTTC AATCTTTTCT 6540 

TGGACAGAAA GGTCATTATC ATACAAACGT TTATAGTTAG CTGGATTTTC AGCAATTGTC 6600 

TTAACAACTG TTTCGGCAAG TGCTACTCCA CCTTCTGCTC CATCAGCCCA GACACTAGCC 6660

AATTCAACTG GTACATCGAT TGAGGCACAG AGTTCTTTTA AGGCTGCAAT TTCAGCTTCT 6720

GTATCAGATA CAAATTCGTT AATAGCTACA ACTGCTGGAA TACCGAACTT ACGGATATTT 6780 

TCAACGTGGC GTTTCAAGTT AGCAAAACCT GCACGAACTG CCTCTACATT TTCTTCAGTC 6840 

AGAGCGTCTT TAGCCACACC ACCATTCATC TTAAGGGCAC GAAGGGTTGC GACAATAACA 6900

ACTGCATCTG GAGATGTTGG CAAGTTTGGT GTCTTGATAT CAAGGAATTT CTCAGCACCA 6960

AGGTCCGCAC CAAAACCAGC TTCAGTAACA GTGTAATCAG CCAAGTGAAG GGCTGTTGTC 7020 

GTCGCCAAAA CAGAGTTACA GCCATGAGCG ATATTGGCAA ATGGACCACC GTGTACAAAG 7080 

GCAGGTGTAC CGTAAATTGT CTGAACCAAG TTTGGCTTAA TAGCATCCTT CAAAATCAAA 7140 

GCCAAGGCAC CCTCAACCTG CAAATCACCT ACAGAAACAG GCGTACGGTC ATAGCGATAA 7200 

CCAATAACGA TATTCGCCAA ACGACGTTTC AAGTCCTCGA TGTCCGTTGC CAAGCAAAGA 7260 

ATTGCCATGA TTTCTGAAGC AACTGTAATA TCAAAACCAT CCTCACGTGG AATACCGTTT 7320 

AGAGGACCAC CAAGACCAAC AGTCACATGG CGGAGCGTAC GGTCGTTCAA GTCCACAACG 7380 

06TTTCCAGA GGATACGACG TTGATCAATT CCCAGCTCAT TCCCTTGGTG CAAGTGGTTG 7440 

TCAATCAAGG CAGAAAGGGC ATTGTTGGCA GTTGTAATAG CATGCATATC TCCAGTAAAG 7500 

TGGAGGTTGA TGTCTTCCAT TGGCAGAACT TGTGCATACC CACCACCAGC AGCACCACCC 7560

TTGATCCXCA TGACTGGACC AAGAGACGGT TCGCGGATAG CAATCATGGT TTTCTTGCCA 7620

ATCTTGTTCA AGGCATCCGC AAGACCAATG GTAAGCGTCG ACTTTCCTTC ACCTGCAGGT 7680

GTTGGGTTGA TGGCAGTAAC CAAGATCAAT TTACCGACTG GATTGCTCTC AACTGCACGA 7740 

ATTTTATCAA AGCTGAGTTT AGCCTTGTAC TTTCCGTACA ACTCCAAATC GTCATAAGAA 7800 

ATACCAAGTT TCTCTACAAC ATCAACAATT GGCTTCAACT CAATACTCTG TGCGATTTCA 7860 

ATATCTGTTT TCATTCAAAA TTCCTCTAAC CTCTTATATG ATAATTCATT ATATCACAAA 7920 

ACAAGATTTT TAACATCCTA AAACTCTCTA AACGTTCGTA AATATCTCTG TTTTTAAGAC 7980 

TTTTAGAGTC CTTTCTTAAA TTTTATATGG CTTTATAGTT TGAAACTATA ATAAATCTTC 8040 

GTTTTTACCA AAAATTTATC ACTTTCATTT TACTTACCGC TTATTTTTGT GTACAATAGT 8100 

GCTATGAAAA TTTTAGTTAC ATCGGGCGGT ACCAGTGAAG CTATCGATAG CGTCCGCTCT 8160 

ATCACTAACC ATTCTACAGG TCACTTGGGG AAAATTATCA CAGAGACTTT GCTTTCTGCA 8220 

GGGTATGAAG TTTGTTTAAT TACGACAAAA CGAGCTCTGA AGCCAGAGCC TCATCCTAAC 8280

CTAAGTATTC GAGAAATTAC CAATACCAAG GACCTTCTAA TAGAAATGCA AGAACGTGTT 8340 

CAGGATTATC AGGTCTTGAT CCACTCAATG GCTGTTTCTG ACTACACTCC TGTTTATATG 8400 

ACAGGGCTTG AGGAAGTTCA GGCTAGCTCC AATCTAAAAG AATTTTTAAG CAAGCAAAAT 8460 

CATCAGGCCA AGATTTCTTC AACTGATGAG GTTCAGGTTT TGTTCCTTAA AAAGACACCC 8520 

AAAATCATAT CCCTAGTCAA GGAATGGAAT CCTACTATTC ATCTGATTGG TTTCAAACTG 8580

CTGGTTGATG TTACCGAAGA TCATCTGGTT GACATTGCAC GAAAAAGTCT TATCAAGAAT 8640 

CAAGCAGATT TAATCATCGC GAATGACCTG ACTCAAATTT CAGCAGATCA GCACCGAGCT 8700

ATATTTGTTG AGAAAAATCA GCTTCAAACA GTCCAGACTA AAGAAGAAAT TGCAGAACTC 8760 

CTCCTTGAAA AAATTCAAGC CTATCATTCT TAGAAAGGAA AACTATGGCA AACATTCTCT 8820 

TGGCTGTAAC GGGTTCAATC GCCTCTTATA AGTCGGCAGA TTTAGTCAGT TCTCTAAAAA 8880 

AACAAGGCCA TCAAGTCACT GTCTTAATGA CTCAGGCTGC TACAGAGTTT ATCCAACCTT 8940 

TGACACTACA GGTACTCTCA CAGAATCCTG TCCACTTGGA TGTCATGAAG GAACCCTATC 9000 

CTGATCAGGT CAATCATATC GAACTTGGAA AAAAAGCAGA TTTATTTATC GTGGTACCTG 9060 

CAACTGCTAA CACTATTGCA AAACTAGCTC ACGGATTTGC GGACAACATG GTAACCAGTA 9120 

CAGCTCTAGC CCTACCAAGT CATATTCCCA AACTAATAGC TCCTGCTATG AATACAAAAA 9180 

TGTATGACCA TCCAGTAACT CAGAATAATC TGAAAACATT AGAAACTACG GCTATCAGCT 9240 

GATTGCTCCT AAGGAATCCC TACTAGCTTG TGGAGACCAC GGACGAGGAG CTTTAGCTGA 9300

CCTCACAATT ATTTTAGAAA GAATAAAGGA AACTATCGAT GAAAAAACGC TCTAATATTG 9360 

CACCCATTGC TATCTTTTTT GCTACCATGC TCGTGATACA CTTTCTGAGC TCACTTATCT 9420 

TTAACCTTTT TCCATTTCCA ATCAAACCGA CCATTGTTCA TATTCCTGTC ATTATTGCCA 9480 

GCATTATTTA TGGTCCACGA GTTGGGGTTA CACTTGGATT TTTGATGGGA TTACTTAGCT 9540 

TGACGGTTAA CACGATTACG ATTCTACCGA CAAGCTACCT CTTCTCTCCC TTCGTACCAA 9600 

ACGGAAACAT CTACTCAGCT ATCATTGCCA TCGTCCCACG TATTTTGATT GGTTTAACTC 9660 

CTTACTTAGT CTATAAACTG ATGAAAAACA AGACTGGTCT GATTTTAGCT GGAGCCCTTG 9720 

GTTCCTTGAC AAATACTATC TTTGTCCTTG GAGGAATCTT CTTCCTATTT GGAAATGTTT 9780

ATAATGGAAA TATCCAACTTT CTTCTGGCAA CCGTTATCTC AACAAATTCA ATTGCTGAAT 9840 

TGGTCATTTC TGCAATTCTA ACCCTAGCCA TTGTTCCACG ACTACAAACC TTGAAAAAAT 9900 

AAAAACAGG 9909
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TAATTTTCAT ATAATAGTAA AATAGAATGT GTGATTCAAT AATCACCTCA AATAGAAAGG 60

AAATTCTATG TCAAATCTAT CTGTTAATGC AATTCGTTTT CTAGGTATTG ACGCCATTAA 120

TAAAGCCAAC TCAGGTCATC CAGGTGTGGT TATGGGAGCG GCTCCGATGG CTTACAGCCT 180

CTTTACAAAA CAACTTCATA TCAATCCAGC TCAACCAAAC TGGATTAACC GCGACCGCTT 240

TATTCTTTCA GCAGGTCATG GTTCAATGCT CCTTTATGCT CTTCTTCACC TTTCTGGTTT 300 

TGAAGATGTC AGCATGGATG AGATTAAGAG TTTCCGTCAA TGGGGTTCAA AAACACCAGG 360 

TCACCCAGAA TTTGGTCATA CGGCAGGGAT TGATGCTACG ACAGGTCCTC TAGGGCAAGG 420 

GATTTCAACT GCTACTGGTT TTGCCCAAGC AGAACGTTTC TTGGCAGCCA AATATAACCG 480 

TGAAGGTTAC AATATCTTTG ACCACTATAC TTACGTTATC TGTGGAGACG GAGACTTGAT 540 

GGAAGGTGTC TCAAGCGAGG CAGCTTCATA CGCAGGCTTG CAAAAACTTG ATAAGTTGGT 600 

TGTTCTTTAT GATTCAAATG ATATCAACTT GGATGGTGAG ACAAAGGATT CCTTTACAGA 660 

AAGTGTTCGT GACCGTTACA ATGCCTACGG TTGGCATACT GCCTTGGTTG AAAATGGAAC 720 

AGACTTGGAA GCCATCCATG CTGCTATCGA AACAGCAAAA GCTTCAGGCA AGCCATCTTT 780 

GATTGAAGTG AAGACGGTTA TTGGATACGG TTCTCCAAAC AAACAAGGAA CTAATGCTGT 840 

ACACGGCGCC CCTCTTGGAG CAGATGAAAC TGCATCAACT CGTCAAGCCC TCGGTTGGGA 900 

CTACGAACCA TTTGAAATTC CAGAACAAGT ATATGCTGAT TTCAAAGAAC ATGTTGCAGA 960 

CCGTGGCGCA TCAGCTTATC AAGCTTGGAC TAAATTAGTT GCAGATTATA AAGAAGCTCA 1020 

TCCAGAACTG GCTGCAGAAG TAGAAGCCAT CATCGACGGA CGTGATCCAG TCGAAGTGAC 1080 

TCCAGCAGAC TTCCCAGCTT TAGAAAATGG TTTTTCTCAA GCAACT 1126



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2520 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CCGGCAACAA AAAAGAAAAA ATCAACAGTT AAAAAAAATC TAGTCATCGT GGAGTCGCCT 60 

GCTAAGCCAA GACGATTGAA AAATATCTAG GCAGAAACTA CAAGGTTTTA GCCAGTGTCG 120 

GGCATATCCG TGATTTGAAG AAATCCAGTA TGTCCGTCGA TATTGAAAAT AATTATGAAC 180 

CGCAATATAT TAATATCCGA GGAAAAGGCC CTCTTATCAA TGACTTGAAA AAAGAAGCTA 240

AAAAAGCTAA TAAAGTTTTT CTCX3CGAGTG ACCCGGACCG TGAAGGAGAA GCGATTTCTT 300 

GGCATTTGGC CCATATTCTC AACTTGGATG AAAATGATGC CAACXTGTGTG GTCTTCAATG 360 

AAATCACCAA GGATGCAGTC AAAAATGCTT TTAAAGAACC TCGTAAGATC GATATGGACT 420

TGGTCGATGC CCAACAAGCT CGTCGGATCT TGGATCGCTT GGTAGGGTAT TCGATTTCGC 480 

CTATTTTGTG GAAGAAGGTC AAGAAGGGCT TGTCAGCAGG TCGCGTTCAG TCCATTGCCC 540 

TTAAACTCAT CATTGACCGT GAAAATGAAA TCAATGCCTT CCAGCCAGAA GAATACTGGA 600 

CAGTTGATGC TGTCTTTAAA AAGGGAACCA AACAATTTCA TGCTTCCTTC TATGGAGTAG 660

ATGGTAAAAA GATGAAACTG ACCAGCAATA ACGAAGTCAA GGAAGTCTTG TCTCGTCTGA 720 

CGAGTAAAGA CTTTTCAGTA GATCAGGTGG ATAAGAAAGA GCGCAAGCGC AATGCTCCTT 780 

TACCCTATAC CACTTCATCT ATGCAGATGG ATGCTGCCAA TAAAATCAAT TTCCGTACTC 840 

GAAAAACCAT GATGGTTGCC CAACAGCTCT ATGAAGGAAT TAATATCGGT TCTGGTGTTC 900 

AAGGTTTGAT TACCTATATG CGTACCGATT CGACTCGTAT CAGTCCTGTA GCGCAAAATG 960 

AGGCGGCAAG CTTCATTACG GATCGTTTTG GTAGCAAGTA TTCTAAGCAC GGTAGCAAGG 1020 

TCAAAAACGC ATCAGGTGCT CAGGATGCCC ATGAGGCTAT TCGTCCGTCA AGTGTCTTTA 1080 

ATACACCAGA AAGCATCGCT AAGTATCTGG ACAAGGATCA GCTTAAGCTA TATACCCTTA 1140 

TCTGGAATCG TTTTGTGGCT AGCCAGATGA CAGCGGCCGT TTTTGATACC ATGGCTGTTA 1200 

AATTGTCTCA AAAAGGGGTT CAATTTGCTG CCAATGGTAG TCAGGTTAAG TTTGATGGTT 1260 

ATCTTGCCAT TTATAATGAT TCTGACAAGA ATAAGATGTT ACCGGACATG GTTGTTGGAG 1320 

ATGTGGTCAA ACAGGTCAAT AGCAAACCAG AGCAACATTT CACCCAACCG CCTGCCCGTT 1380 

ATTCTGAAGC AACACTGATT AAAACCTTAG AGGAAAATGG GGTTGGACGT CCATCAACCT 1440

ACGCXSCCAAC CATTGAAACC ATTCAGAAAC GTTATTATGT TCGCCTGGCA GCCAAACGTT 1500 

TTGAACCGAC AGAGTTGGGA GAAATTGTCA ATAAGCTCAT CGTTGAATAT TTCCCAGATA 1560 

TCGTAAACGT GACCTTCACA GCTGAAATGG AAGGTAAACT GGATGATGTC GAAGTTGGAA 1620 

AAGAGCAGTG GCGACGGGTC ATTGATGCCT TTTACAAACC ATTCTCTAAA GAAGTTGCCA 1680 

AGGCTGAAGA AGAAATGGAA AAAATCCAGA TTAAGGATGA ACCAGCTGGA TTTGACTGTG 1740 

AAGTGTGTGG CAGTCCAATG GTCATTAAAC TTGGTCGTTT TGGTAAATTC TACGCTTGTA 1800 

GCAATTTCCC AGATTGCCGT CATACCCAAG CAATCGTGAA AGAGATTGGT GTTGAGTGTC 1860

CAAGCTGTCA TCAGGGACAA ATTATTGAGC GAAAAACCAA GCGTAATCGC CTATTCTATG 1920 

GTTGCAATCG CTATCCAGAA TGTGAATTTA CCTCTTGGGA CAAGCXTTGTT GGTCGTGACT 1980 

GTCCAAAATG TGGCAACTTC CTCATGGAGA AAAAAGTCCG TGGTGGTGGC AAGCAGGTTG 2040

TTTGTAGCAA AGGCGACTAC GAGGAAGAAA AGATGGCTCT TTGTCAACTG TAGTGGGTTG 2100 

AAGTCAGCTA AGCTCGAGAA AGGACAAATT TTGTCCTTTC TTTTTTGATA TTCAGAGCGA 2160

TAAAAATCCG TTTTTTGAAG TTTTCAAAGT TCCGAAAACC AAAGGCATTG CGCTTGATAA 2220

GTTTGATGAG ATTATTGGTC GCTTCCAATT TGGCGTTAGA ATAGTGTAGT TGAAGGGCGT 2280

TGACGATTTT CTCTTTGTCC TTTAGAAAGG TTTTAAAGAC AGTCTGAAAA AGAGGATGAA 2340 

CCTGCTTTAG ATTGTCCTCA ATGAGTCCGA AAAATTTCTC CGGTTCCTTA TTCTGAAAGT 2400 

GAAACAGCAA GAGTTGATAG AGCTGATAGT GATGTTTCAA GTCTTGTGAA TAGCTCAAAA 2460 

GCTTGTTTAA AATCTCTTTA TTGGTTAAAT GCATACGAAA AGTAGGGCGA TAAAAATGTT 2520 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TTTTCTCGAT AATAACTTCC ACCTTATTAT TTGGGATACC CTCCTCTTCT TCACCACCAC 60 

GTTCATAGTA GTCATCGCGA TAGAGAAAAG CTACGATATC AGCGTCCTCC TCAATAGACC 120

CAGATTCACG AATATCAGAC AAGACCGGTC TCTTGTCCTG ACGTTGTTCT ACACCACGAG 180 

AAAGCTGACT CAGAGCGATT ACTGGAACCT TCAATTCCTT GGCTAGTATT TTCAACTGAC 240 

GAGAAATTTC AGAAACTTCT TGTTGACGAT TTTCTCGACC AGTTCCCGTG ATAAGTTGCA 300 

AATAGTCTAT CAAAATCAAA CCAAGATTTC CAGTTTCTTG AGCCAATTTA CGAGAACGAG 360

AACGAATCTC TGTAATCCGA ATACCTGGCG TATCATCGAT ATAGATACTG GCGTTAGCTA 420

GATTACCCTG AGCAATAGTA TATTTTTGCC ACTCCTCATC TGTCAATTGC CCTGTACGGA 480 

TAGAATGTGA CTCCACTAAG CCTTCTGCAG CTAACATACG ATCTACCAAG CTTTCCGCAC 540

CCATTTCGAG TGAAAAAATA GCAACCGTTT TGTCCAACTT AGTCCCAATG TTCTGAGCGA 600 

TATTCAAGGC AAATGCTGTC TTACCAACTG CTGGACGAGC TGCTAAGATA ATCAACTCCT 660 

CCTCATGAAG TCCTGTTGTC ATATGATCCA AATCACGATA ACCTGTCGCA ATACCTGTAA 720 

TATCGGTCGT TTGTTGCGAG CGAGCTTCCA GATTTCCAAA GTTGAGATTC AACACATCTC 780 

GAATGTTCTT AAACCCGCTT CGATTTGCAT TTTCACTGAC ATCAATCAAC CCTTTTTCTG 840 

CCTGAGCAAT AATTTCATCA GCTGGTTGTG ACGCTTCGTA AGCTTGGTTG ACAGACTCTG 900 

TCAACTTGGC AATTAAACGA CGTAGCATTG CTTTTTCTGC AACAATCTTA GCATAATACT 960 

CCGCATTAGC AGAAGTTGGC ACAGAATTAA CAATCTCAAC CAAGTAAGAC AAGCCACCAA 1020

TATTCTGTAA ATCACCTTGA TTATCAAGGA TAGTACGAAC CGTTGTTGCA TCTATGGCAT 1080

CACCACGATC GGATAAATCG ACCATGGCTT GGAAAATCAA ACGATGGGCA TACTTAAAAA 1140 

AGTCCCGAGA CTCAATGTAT TCTCGCACAA AAACAAGTTT ACTCTCATCA ATAAAGATAG 1200 

CCCCTAAAAC GGATTGCTCA GCTAAGATAT CTTGAGGTTG TACTCGTAAC TCTTCTACTT 1260 

CTGCCATCAG ACTTCCCTTC CTTTTACAAT CTTGTCAAGA AGGTGTAAAC TTATCCTTCT 1320 

TTCACACGAA GATTGATTAC ACTTGTGATA TCTTGATAGA TTTTCACTGG CACATCAATC 1380 

AAACCAACCG CTCGAATCGG AGCTTGTACT TGAATATGAC GTTTATCAAT CTTAATTCCA 1440 

AATTGCTTTT GCAATTCTTC TGCAATCTTC TTATTGGTAA TAGAACCAAA GGTACGACCA 1500 

TCTGGACCAA CTTTTTCAAC AAATTCTACA ACAGTTTCTT CTGCTTCAAG TTGTGCTTTA 1560 

ATTGCTTTTC CTTCTGCAAT CATCTCAGCG TGAGCTTTTT CTTCCGATTT TTGTTTACCA 1620

CGAAGTTCAC CTACAGCTTG AGCAGTCGCT TCTTTGGCTA GATTCTTTTT GATAAGAAAG 1680

TTTTGCGCAT ACCCTGTTGG TACTTCCTTA ATTTCGCCTT TTTTACCTTT TCCTTTAACA 1740 

TCTGCTAAAA AGATTACTTT CATTCTTCTT TCTCCTTTTC CTTCATTTCA TTTAATACAA 1800 

TTTCTGTCAG TTTTTCACCT GCTTCTGACA AGGTTACATC TTTAATTTGA GCTGCTGCCA 1860

AATTAAAGTG GCCTCCACCG CCTAACTCTT CCATAATCCG TTGTACATTC AGTTTACTAC 1920 

GACTTCGAGC TGAGATAGAG ATAAATCCTT GTGTATTCTT CGCAAGAACA AAACTCGCTT 1980 

CAATACCTGA CATGGCTAAC ATGGCATCTG CTGCCTTACT AATAACAACT GTATCATAGC 2040 

ATTTCATGTC CTTAGCCTCT GCTATTAGTA CATCTGAACC TAATTTACGC CCCTGTAAAA 2100 

TAAGTTCATT GACCTCACGA TATTCTTCAA AATCTGTCGC AGCGATTTCC TGGATAGCAA 2160 

TACTATCACT TCCGCGCGTT CTGAGATAGC TAGCAACATC AAATGTCCGA CTAGTTACTC 2220

GCGAGGTGAA ATTTTTAGTA TCCAACATCA TACCAGCCAT CAAGACACTT GCTTCCATAC 2280 

GACTCAAACG ATTTTTCTTA GAATTCTGGA ACTGAATCAA TTCCGTTACC AACTCACTGG 2340 

CACTACTTGC ACCACTTTCG ATATAAGTAA TAACCGCATT ATCTGGAAAA TCCTGATCCC 2400 

TTCTATGGTG GTCAATAACA ATGGTTTGGG TAAATAAATC ATAAAATTCT TTTGATAATG 2460 

TTAAGGCTGT CTTTGAATGG TCTACAAGAA TCAACAAAGA ACGATTGGTC ACCATCCCCA 2520

TTGCATCCTT AACAGACAAC AACTTCGTAA CTCCTTCTTT TTCTATGAAT GAAACAGCTC 2580 

GTTCAATATC TGGAGACATT TGTTCTTCAT CATAAAGAGC ATAGCTATTT TCAATCACAT 2640 

TGCTGGCGAA CAACTGCATA CCTACAGCAG AGCCCAAAGC ATCCATGTCT AAATTTTTGT 2700 

GACCGACTAC AAAAACCTGA TCTACACTCC GAATCTTATC TGAAATAGCT GTCATCATAG 2760 

CGCGCGTACG AGTCCGTGTA CGCTTGATTG AAGCAGCAGA CCCACCACCA AAATAAACTG 2820 

GATTTTTCGT TTCGTCGTTT TCCTTAACAA CCACCTGGTC GCCACCACGT ACTTCAGCCA 2880

AGTTCAAATT GAGCAAAGCA ACTTTCCCTA TCTCATCATG ATTTCCATCG CCATAAGAAA 2940

ATCCCATACT TAAGGTCAAG GGCAACTGTC TCTGTTTCGA CTCTTCTCTG AAAGCATCAA 3000 

TAACAGAAAA TTTATCATTC ATCAAGCCCT CAAGCACCGT GTAGTCAGTA AATAGATAAA 3060 

ATCGATCCAT ACTTACCCGA CGAGAAAACA TCATGTGTTT TTCTGAAAAC TCTGATATAA 3120 

AATTAGCTAC AT^CTATTG ATTTGACTAA TATCTGACTC AGAAGTTTCA TCCTCCAAAT 3180 

CATCATAATT ATCCACAGAG ACAATCCCAA TCACTGGTCT ACTTGTTACC AATTCATCTG 3240 

TTATGGCTTG TTCCCTGGAT ACATCTACAA AATACAAAAC ACCGGAAGAA GCATCCATAT 3300 

GAACAGCATA ACGCTTCTCA CCAAGCTTGG CATAAGTAGA CGGATTTCCT ACTGAAGCCT 3360 

TGATAATCGT TTGAACAGCr TCTAAATCAA AATCACCATC TTCCTTGGTC AAAATCAATT 3420 

CAGCATAGGG ATTAAACCAC TCAACCTCTC CAGAAGATAA ATTCAATTTC ATAACACCTA 3480 

CAGGCATCTG TTCCAATAGA GCTGTCAAAC TTTCTTCCGC TTGGTGGTTT ACATACTGTA 3540 

TCTGTTCTAC ATCACTCCTT GTATAATGCA CTCTCAGTTT CTTAAATAAA AAAACATAGC 3600 

CTCCTACAAA AAGAAACAAA ATTAAAACCG TCAACAGATT ATTATTAACA AAAATAATGA 3660 

AAGTGGATAA GACTCCAAAC GCAATCAATC CTACTAGAAT AGGAAAAATT GGACTTACAT 3720 

AAAATTTTTT CATTCAAAAC CTCTTGGCAC CCATTATACC ATAATACCCC TCAAAAAGCG 3780 

ACTTTTTAAA AGTGTAATCA GTAATTCTAT CAATTATAAG AAAAAGGTAG TTTACAATTC 3840 

AGTAAACCTA CCTTTACACA TATTGAAATT AAGATTCTTT AACCTCTAAC AAACCAATTT 3900 

CGCCATCCTC ACGACGATAA ATCACATTGG TTGTCTGATC TTCAACATCC ACATAGATAA 3960 

AGAAATCATG CCCCAATAAA TCCATTTGTA GAATTGCTTC TTCCAAATCC ATTGGTTTTA 4020 

AATCAATTTG TTTTGAACGA ACAACTTTAG ACTGGACAAT ATTTGAATCT TCCACCAAAG 4080 

CATCTGTAAA TAATTGACCA GTTGCTACCT TATTTTTATT TTTACGCTCG ATTTTTGTTT 4140 

TATTTTTACG AATCTGACXST TCAATTTTAT CAGTTACAAG GTCAATTGAA CCATACATAT 4200 

CTTGAGATAC ATCTTCTGCG CGGAGAGTAA TAGATCCAAG CGGAATCGTT ACTTCCACTT 4260 

TAGCCGTTTT TTCACGATAA ACTTTTAAGT TAATTCGGGC ATCCAACTCT TGTTCTGGTT 4320 

GGAAGTACTT TTCGATCTTT TCGAGTTTAG AAACTACATA ATCACGAATT GCTTCTGTTA 4380 

CTTCTAGGTT TTCACCACGG ATACTATATT TAATCATATG AGTACCTTCT TTCTAAACAT 4440 

TTTTGTTTTT ATGATTTTAT TATAACGCTT TCATTCTATT TTTGCAAATT TTTTCCTCAT 4500 

CTTACAAGGG AAAATGTTTT TACATCCTTA GCACCAGCTT CTTCCAACAG TTTCTTAACA 4560 

CGATTTATAG TTGCTCCTGT AGTATAGATA TCATCTATAA GTAGGATTTT TTTAGGAATA 4620

GTGACTCCAC TTTTAATAAA GAAAGGAAGT TCTGTCCCCA AGCGCTCTGA ACGATTTTTA 4680

GAAGAACTGG CTCTCTCTTC TCTTTTCTCT AATAAATCCA GATACTCAAA GCCTGCTGCC 4740

TCTACCAAGC CCTCAACCTG ATTAAATCCT CTATTAGCAT ATCTATCAGG ACTTAGGGGA 4800

ATTACAACAA ATTGATACTC TTTGTACTTT TTCAACTCCT CACTTAAAAA TGAAGCGAAA 4860

ACTTTTCTTA ACAGGAAGTC TCCATCAAAC TTATACCGAC TGAAAAAATC CTTCATAGCT 4920

TGATTGTAAG TAAAAATCGC TCTATGACTG ACTTCAACTC CCTCTTTACA CCAAAGTTGA 4980

CAATCTTGAC ACTTTGriXSA CAACTCTGTT TTCATACAAT TTGGACAGTT CTCTTCCCCA 5040

ATTCTTTCAA AAGTAGAATC ACAGTCTGAA CAAAGACAAG AGTCATCATT CCTCAGAAGT 5100

AAGAGACTAC TAAAAGTTAA AACAGTCTTC ATAGTCTGCC CACATAACAA GCACTTCATA 5160

GACCAGCCTC CTTATTCATC ATCTGAATTT CCTTAATCGC CTTCTTGATT GAAGCATTTA 5220

ACCCATCATG GAAGAAAAGC AAATCTCCTG TCGGTCTATC CATGCTTCGT CCAACTCGTC 5280

CACCAATCTG AATCAAACTA GACTTGGTAA ACAAACGATG ATTGGCCTCT ACTACGAAAA 5340

CATCCACACA AGGGAAGGTA ACTCCGCGCT CCAAGATTGT CGTACTGATA AGTATTGTCA 5400

GTTCTCCATC TCGAAAAGCT TGTACTTGCT CTAATCGATC CTCTGTTACA GAAGATACAA 5460

AGCCAATTTT CTCATTTGGA AATTGCTCCT GTAAGATTTC TGCTAACTGC TCCCCTTTCT 5520

TAATTTCTGA AGCAAAAATG AGTAACGGAT AAGCTGTCTT TCTCTGCTTC TCAATATAGG 5580

ACTTTAACTT TGGTGACAAA CGATTCTTGT CTAAGTAGCG ATTAAAATCC GATAACCAAA 5640

TTGGTTTTGG AATAATCAAC GGATTTCCAT GAAACCGTCT CGGTAAATTC AGTCTTTTTA 5700

GTTCTCCTAA ACGGACCTTT TTATCTAACT CATTGGTCGA AGTCGCTGTT AAAAAGATTC 5760

TCAATCCATT CTCCTTTACA CTATTCTTGA CAGCGTGGTA AAGCATGGGA TTATCAACAT 5820

AAGGAAAAGC ATCTACTTCA TCCACTATCA GCAAATCAAA AGCTTGATAA AACTTCAATA 5880

ACTGATGGGT TGTTGCAACA ACTAGTGGTG TTCGAAAATA AGGTTCCGAT TCTCCATGTA 5940

GCAAAGCTAT CCCX3CAAGAA AAATCCTGTT GCAGGCGCTT GTACAGCTCC AAACAAACAT 6000

CTATGCGAGG ACTAGCCAAA CACACTGCAC CACCCGCATT GATCACTTTA GCCACTACTT 6060

GATAAATCAT TTCTGTCTTT CCAGCTCCTG TTACCGCATG AACTAAGGTT GGCTTTTGCT 6120

TGTCTACTAC TTGAAGCAAT CCCTCTGACA CCTTCTCTTG AAAAGGAGTT AATTGGCCGC 6180

GCCATTTGAG AACATCTTGC TTTGGAAAAT CCTCCTGCGG AAAATAGTAT AAAGTTTGAT 6240

CACTTCTGAC TCGCTTCATC AGCAAGCACT CTCGACAATA GTAAGCACCG ATGGGCAAAT 6300

ACCATTCTTC TAGAATAGTA CTATTACAGC GTTGACAGAA AAGTTTCCCC TTCTCCTTTC 6360

TCATTGCTGG AAGTTTCTCC GCCAACTGAC GTTCTTCTTC TGTTAATTCA TTCTCAGTAA 6420

ATAAACX3ACC GAGATAATCT AAATTTACTT TCATACTTCT TTATTCGTAA AAACTAGCAC 6480 

TTTAGATGAT TTTTTAGTAC AATTAAATCA TGGAATTTAG GACAATTAAA GAGGACGGTC 6540 

AAGTCCAAGA AGAAATCAAA AAATCTCGCT TTATCTGCCA TGCCAAGCGT GTTTATAGCG 6600 

AAGAAGAGGC TCGTGACTTC ATTACTGCCA TCAAAAAAGA ACACTACAAA GCGACACATA 6660 

ACTGCTCTGC CTTCATTATT GGAGAACGTA GTGAAATTAA ACGTACAAGT GATGATGGTG 6720

AGCCTAGTGG TACTGCTGGT GTTCCCATGC TTGGGGTACT AGAAAATCAC AATCTCACCA 6780 

ATGTCTGTGT GGTCGTGACA CGCTACTTTG GTGGTATTAA ACTAGGCGCT GGAGGACTAA 6840 

TTCGTGCTTA CGCCGGCAGT GTCGCCTTAG CTGTCAAAGA AATTGGTATT ATTGAAATAA 6900 

AAGAAGAGGC TGGCATTGCT ATTCAAATGT CTTATGCTCA GTACCAAGAG TACAGTAACT 6960 

TCCTTAAAGA ACATGGTCTC ATGGAGCTGG ATACAAACTT TACAGATCAA GTCGATACGA 7020 

TGATTTATGT TGATAAAGAA GAAAAAGAAA CTATTAAAGC TGCACTTGTG GAGTTTTTTA 7080 

ATGGAAAAGT CACTTTAACT GACCAAGGTT TACGAGAGGT TGAAGTTCCT GTAAACTTAG 7140 

TGTAAACAAT GAATAATACA GCGTTTCGTT GACATTCTCA CAACTACTTT AGCGAGCAAA 7200 

ATAAAAAGAG GCGTACCAAA ATATACTAGA AAATGAAGCA ATTCAAACGA AACCTGATAT 7260 

CGTTTTCCTT CACACCTATT TACTAGAATT AGCTGAACGC AATCACTTGA AAATTAATGA 7320 

CTTTGATCTA TGATATATAG AAATGGTATG GATAGCGTTA TACTAAAGAT ATCTTATACA 7380 

AAGAGGTATT CATATGTCTA TTTATAACAA CATTACTGAA TTAATCGGTC AAACACCGAT 7440 

TGTTAAACTT AACAACATCG TGCCAGAAGG TGCTGCAGAC GTCTATATAA AGCTTGAAGC 7500 

ATTTAATCCT GGTTCATCTG TAAAAGACCG TATTGCCCTT AGCATGATTG AAAAAGCTGA 7560 

ACAAGATGGT ATTCTGAAAC CTGGTTCTAC TATTGTTGAA GCAACAAGTG GAAACACCGG 7620 

TATTGGACTT TCATGGGTAG GTGCTGCTAA AGGGTATAAA GTCGTCATCG TTATGCCTGA 7680 

AACTATGAGT GTAGAACGAC GTAAAATTAT CCAAGCTTAT GGTGCTGAAC TCGTCCTAAC 7740 

TCCTGGTAGC GAGGGAATGA AAGGTGCTAT TGCTAAGGCT CAAGAAATCG CTGCTGAACG 7800 

TGATGGTTTC CTTCCTCTTC AATTTGACAA TCCAGCTAAT CCAGAAGTAC ACGAAAGAAC 7860 

AACAGQAGCT GAGATACTAG CTGCTTTCGG TAAAGATGGA TTAGATGCCT TTGTTGCTGG 7920 

AGTAGGTACT GGTGGAACGA TTTCTGGTGT TTCTCATGCA CTCAAATCAG AAAATTCTAA 7980 

CATTCAAGTT TTTGCAGTAG AAGCAGATGA ATCTGCTATT CTATCTGGTG AAAAACCTGG 8040

TCCTCACAAA ATTCAACGTA TCTCAGCTGG ATTTATTCCT GATACACTTG ATACTAAAGC 8100 

CTATGATGGT ATCGTTCGTG TAACATCAGA TGACGCTCTT GCACTCGGAC GTGAAATTGG 8160

TGGAAAAGAA GGCTTCCTTG TAGGGATTTC CTCAGCTGCA GCTATCTACG GAGCCATCGA 8220 

GGTTGCCAAA AAATTAGGTA CAGGTAAAAA AGTCCTTGCC CTAGCACCAG ATAACGGTGA 8280 

ACGTTATCTC TCTACAGCAC TTTATGAATT GTAACCGTCC AATAACGAAG TCTATTGAAA 8340 

AATCTCCAGA CTAGAGAACT CACGGATAGT TCCTAATCTG GAGATTTCTT ATTTGCACTT 8400 

TTCTTGTACA ACTTTAGTCC ATGGTAAATA GGCCTCTAAA ACCTCTTTGT TTACGAGAGT 8460 

TTCCACGTTT GGAAGACATT CTAGAAGATA GGATAGATAT TTCTCACTAT TTATAATGGA 8520 

TTGAAATAAG ATATGAACAA ATCGATTAGA ACATGATGGT AAAGCGTAAT CCCTTGTTTC 8580 

TCAGCTTTCC CAGACAAAAA AGTCCAATAG TAAGTCAGCT GACTATCACT CTCTAGCACC 8640 

CTATAAGAAG TTTCATCCGC ATGAAGTAAG GGCTGAGTCA ATAGTCTCTC TCGCAAGAGG 8700 

TTATAAAGGG GCTCCAAATA GTATTGACTC GTCTTGATAT GCCAATTAGA GATTTCCTTA 8760 

CGTGTGATTG GTAAACCCAT CCTAGCCCAA TCTTCTTCTT GGCGATAATT GQGTACCTTC 8820 

AGATTAAACT TCTGATGGAT GGTGTGAGCG ATAATAGAAG CTGAGCCAAA GTTATGCGCT 8880 

AAAGGGGCTT TAGGAATAGG AGCTTTCACA AGCTTATCCA GATGATTATC TTTTACTCGT 8940 

TATGGACAAT GCTATATGGC ATAAATCAAG TACCTTAAAG ATTCCGACTA ATATTGGCTT 9000 

TGCATTTATT CCTCCATACA CACCAGAGAT GAACCCCATT GAACAAGTGT GGAAAGAGAT 9060 

TCGTAAACGT GGATTTAAGA ATAAAGCCTT TCGAACTTTG GAAGATGTCA TACAAGGACT 9120 

GGAGAAGGAG GTGATAAAGT CCATCGTTAA TCGGAGACGG ACTAGAATGC TTTTTGAAAA 9180 

CAGATGAGTA TAAAAAGAAA GTCXTTCATTT CAATAGAAAT CACGACTTTC TGATGAATTT 9240 

ATAGTAAAAT GAAATAAGAA CAGGATAGTC AAATCGATTT CTAACAATGT TTTAGAAGCA 9300 

GAGGTGTACT ATTCTAGTTT AAATCCACTA TATTTGGGGA GTGATAGAAA AGCCCTTCAT 9360 

CAGCCAATCT ACTTGTTCAG GTGCGAGAGC TTTGACATCC TTTTCTGTAC TGGACCAAGT 9420 

CAGTTTTCCG TTCTCAAAGC GTTTATATAA TATCCAAAAT CCTTGACCAT CCCAGTAAAG 9480 

AACTTTAAAG CGGTCTTTAC GTCCACCACA AAAGAGAAAG ACTTGATCGG AGAAAGGATC 9540 

CAATTCAAAG TGGGTTTTAA CTACATAGGC TAATGAGTCT ATTCCCTGCC TCATATCTGT 9600 

CTTGCCACAA ACAAGGTGAA CTTGACCTAA ATCACTTAGT TGAATTATCA TAGTACAATA 9660 

CCTTTCCTCC GATAATTATT TTTTATCTGG TATACTGGAA GTTGGGGAAT TAGGATAGAT 9720 

ACCTTGTTAT GACGCGCTTA CTATGAATTT GAAGTATAGT CTCCTAAATG CACTTAGCCC 9780 

TTATTATAGG GCTTTTTGTT TTAATTATTC TAATCGAGTG AGACTGGGGA AAAAACAATT 9840

TCAGGAAAAA TCTAAGCCCT ATACAAAAAA GGAAGCAATT TGCTTCCTTT CTATTATTAG 9900 

TTATTCAAGG CTGCTGCCAT TGTAGCTGCA ACTTCAGCTT CXSAAGTCGTT TGCAGCTTTC 9960

TCGATACCTT CACCAACTTC AAAGCGAGCA AACTCAACTA CCGAAGCGTT AACTGATTCA 10020 

AGGTATGCTT CAACTGTCTT GCTGTCATCC ATGATGTAAA CTTGTGCAAG AAGTGTGTAA 10080 

GCTTGGTCAA CTTTAGTGTT ATCAAGCATG AAGCGATCCA TTTTACCTGG AATAATTTTG 10140 

TCCCAGATTT TTTCTGGTTT GCCTTCTGCA GCCAATTCAG CTTTGATGTC AGCTTCAGCT 10200 

TGAGCAATAA CATCATCAGT TAATTGAGCT TTTGATCCAT ACTTCAAGTG TGGAAGAGCT 10260 

GGTTTATTAA CCATTGCACG GCTTTCGTTG TCTTGGTCGA TAACGTGATT CAATTGTGCC 10320 

AACTCATCTT TAACGAATTG CTCATCCAAT TCTTTGTAAG AAAGAACTGT TGGTTTCATC 10380 

GCTGCGATGT GCATTGACAA TTGTTTAGCA AGTGCTTCGT CTCCACCTTC AACAACTGAA 10440 

ATAACACCGA TACGTCCACC GTTATGTTGG TATGCTCCAA AGTGTTGTGC GTCTGTTTTT 10500 

TCAATCAATG CAAAGCGACG GAATGAOATT TTCTCTCCGA TAGTTGCTGT TGCAGATACG 10560 

TATGCAGCTT CAAGAGTTTC ACCTGAAOGC ATTATCAAAG CAAGAGCTTC TTCGTTGTTA 10620 

GCAGGTTTTC CTTCAGCAAT GACTTTAGCT GTAGTATTTA CCAATTCAAC GAATTGAGCG 10680 

TTTTTTGCAA CGAAGTCAGT TTCAGCGTTT ACTTCAATAA CTGCTGCAAC ATTACCGTTA 10740 

ACATAAACAC CAGTCAAACC TTCTGCAGCA ACACGGTCAG CTTTCTTAGC TGCCTTAGCC 10800 

ATACCTTTTT CACGAAGCAA TTCTU^TCGCT TTTTCGATGT CACCGTCTGT TTCTACAAGC 10860 

GCTTTTTTAG CGTCCATAAC ACCGGCACCA GATTTTTCAC GCAACTCTTT TACAAGTTTA 10920 

GCTCTAATTT CTGCCATTTT AATTCTCCTA TATTTTTTGA AAATAGGAGA GCGCGGCTAA 10980 

GCCCCGCCTC CGG 10993 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8411 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CGACGGGGAG GTTTGGCACC TCGATGTCGG CTCGTCGCAT CCTGGGGCTG TAGTCGGTCC 60 

CAAGGGTTGG GCTGTTCGCC CATTAAAGCG GCACGGGAGC TGGGTTCAGA ACGTCGTGAG 120

ACAGTTCGGT CCCTATCCGT CGCGGGCGTA GGAAATTTGA GAGGATCTGC TCCTAGTACG 180 

AGAGGACCAG AGTGGACTTA CCGCTGGTGT ACCAGTTGTC TTGCCAAAGG CATCGCTGGG 240

TAGCTATGTA GGGAAGGGAT AAACGCTGAA AGCATCTAAG TGTGAAACCC ACCTCAAGAT 300

GAGATTTCCC ATGATTATAT ATCAGTAAGA GCCCTGAGAG ATGATCAGGT AGATAGGTTA 360

GAAGTGGAAG TGTGGCGACA CATGTAGCGG ACTAATACTA ATAGCTCGAG GACTTATCCA 420

AAGTAACTGA GAATATGAAA GCGAACGGTT TTCTTAAATT GAATAGATAT TCAATTTTGA 480

GTAGGTATTA CTCAGAGTTA AGTGACGATA GCCTAGGAGA TACACCTGTA CCCATGCCGA 540

ACACAGAAGT TAAGCCCTAG AACGCCGGAA GTAGTTGGGG GTTGCCCCCT GTGAGATAGG 600

GAAGTCGCTT AGCTTTAATC CXSCCATAGCT CAGTTGGTAG TAGCGCATGA CTGTTAATCA 660

TGATGTCGTA GGTTCGAGTC CTACTGOCGG AGTAATTGAT AAAAGGGAAC ACAGCTGTGT 720

TCCTCTTTTT GTATCAATTT GTATCACCAA GCATTTTCAT AAGGAAGTCT GTTATTTCTT 780

GAGAACTTTC TTTTTTTCCA TGTGCAATCC AAGTTTGGCA GACACCAAAA AGTGCATGAG 840

TTAGATAGAT GCTACTATAT TCTAATTCAG TGGTATTTAG ATTCAGTTGC ATAAATCGCT 900

TTTGTAAATC TGTACTAAGC ATGATATGAA GTTTATTTCG TAAGAAATTT TGGATTTCTT 960

TAGTCCCATT TTCAGAAAGA AGGGCAGCCA GAAGTGGTTC TGACTCTAGA TATTCAAAAA 1020

CTTCTAAAAT AGCGTCTCTT TTGTGATGAG CATGTTTTTG AAAAATATAT TCAAATGTAT 1080

GGAATAGCTT GCTTTGATAG TGCTCAATCA TATCATACTT ATCCTTATAG TGAGTATAGA 1140

AGCTGGAACG ACTAATTCCG GCTTTTTCTA CTAATTTGAC AGTAGAAATT TTATCAAATG 1200

GCTGTTCCAT CAGTAATTGT ACCATAGCAT TTTCAATAGT TCGCTTTGTT TTTAAGCGTT 1260

TGTTACTTTC TTGCATATTT CCTCCTTGTA AACAAATTAG ACTATATGTC TAAAAATAGA 1320

TTTTTTATCT TGTAATTTAG ATTTTTTAAT GTATAATCTA TTATATCAAA ATTTTAGACA 1380

ATATGTTT7UV AAAAGGAGAA ACTAAGTTTA AAGAATGGAA AGCAATTTAA AAAAAACCAA 1440

CCTTTATTAT TGTCATGATC GGGATTTCTC TTATTCCAGA TCTGTACAAT ATCATATTTT 1500

TGTCATCAAT GTGGGATCCA TATGGGCAAT TGTCTGACTT ACCTGTGGCA GTTGTAAATA 1560

ATGATAAAGA GGCTTCCTAT AATGGTAATA CTATGGCAAT AGGAAAAGAC ATGGTGTCCA 1620

ATTTAAAAGA AAATAAAACC TTGGATTTTC ATTTTGTAGA TGAAGAGGAA GGAAAGAAGG 1680

GATTGGAAGA TGGCGATTAC TATATGGTAG TGACTTTACC AAGTGATTTA TCTGAAAAAA 1740

CAACTACATT ATCCAATATT CAATCGACAG CAGCTTATCA ATCATTGACA AGTGAGCAAC 1800

AAACTGAGAT AAGTGATTCT GTATCTCAAA ATTCAACTGA TAGTATTCAA TCGGCTCAGT 1860

CAATTGTAGC TTTAGTACAA GATTTACAGG GAAGTTTAGA AAACTTACAA AATCAATCTT 1920

CTAATCTTTC GACTTTAAAA AATCAATCTA ATCAAGTATC ACCTATTACT TCTACTTCTT 1980

TGATAGGATT GTCAAGTGGA TTAACAGAGA TACAAGGAGA TGTTACTAGC AAATTAGTTC 2040

CTGCCAGTCA GTCGATTGCA TCAGGTGTAA ACGCATATAC TACAGGTGTT GATAAAGTTl* 2100

CTCAGGGCGC AAGTCAACTA AGTGAAAAAA ATGCCACCTT GACAGGTAGT TTGGATAAAC 2160

TAGTTTCAGG CTCAAACACC TTGACACAAA AATCTTCTAG ATTGACAGCA GGAGTTGGTT 2220

AATTACAATC AGGATCTGGG CAATTAGCAG ACAAATCCAG TCAGTTACTT TCAGGTGCTT 2280

CTCCATTAGA GAATAGAGCT AATAAATTGG CAGATGGATC TGGGAAACTA GCAGAAGGTG 2340

GAACAAAGTT AACTTCTGGA TTGGAAGATT TACAGACAGG ACTTGCTTCT TTAGGACAAG 2400

GACTAGGTAA TGCTAGTGAT CAACTCAAAT CAGTATCAAC AGAATCTAAA AATGCAGAGA 2460

TTTTGTCAAA TCCACTCAAT CTTTCAAAAA CAGACAATGA TCAAGTTCCT GTAAATGGAA 2520

TCGCAATAGC TCCTTATATG ATATCAGTTG CTCTTTTTTT GCAGCAATAT CAACAAATAT 2580

GATATTTGCG AAATTGCCTT CAGGACGTCA TCCAGAGAGC CGTTGGGCTT GGTTGAAATC 2640

TTGAGCTGAA ATAAATGGTA TTATAGCTGT TTTGGCAGGA ATTTTGGTAT ATGGAGGAGT 2700

TCAGCTTATT GGTTTAACTG CTAATCATGA GATGAGAATA TTTATTCTCA TCATCCTAAC 2760

AAGTTTAGTA TTCATGTCTA TGGTGACCAC TTTAGCAACG TGGAATAGCC GTATAGGAGC 2820

TTTTTTCTCA CTTATTTTGC TTTTACTACA GTTAGCATCA AGTGCAGGTA CTTATCCACT 2880

TGCTTTGACA AATGATTTCT TTAGATCTAT TAATCCCTGG TTACCAATGA GCTATTCAGT 2940

TTCGGGATTA CGACAAACAA TCTCTATCAA CAAGTCATTT TCCTAGCTGT CATACTAGTT 3000

CTATTTACTA GTTTAGGTAT GCTAGCCTAT CAACATAAGA AAATGGAAGA AGATTAAAAA 3060

AATCGACCGA TTAACTGGTC GATTTTTTAT GCCTTAGATG ACTTTCGTCT GTGATTATAG 3120

ATTCCAAATA GTAAGAGAGA AGTAAAGGAA CAGATTGCTC CAGTAATAAA ACCATTGGGA 3180

ATGAAGGAAA GTGTAATAGT TCCTTTCCCC TTGGGAATGT CAACTTTCAT AAATCCAGTT 3240

TGAGCTTGTT TAATTTCTAT TTTCTTACCA TCTTGGTAGG CAGACCAACC TTTGTCATAA 3300

GGAATGGTGA AGAAAATAGA TGTATCTTGT TGGACATCAT ATGTAGCAAA AACCTTGTTT 3360

TTAGAAGTTG ATACTGTGAC AGGTTGTTCT TTAATTTTTT GAATTGCCTC GGTGAAAGTT 3420

TTGGTATCTA AACGATAGAA GGTAGGAGAT TCAAATGATA CTTGTGAATT TCCAGGGAAA 3480

CTAACATTGA TATTGAAAGT TTTTTTCTCT TTAGTATATC CTAGATTAAA GAAGGAGAAG 3540

ACATTATCAG TTGTAAAAGT CTTTTTTTCA CCATTTACAA GGATGTCAAC CTTCTTTTGT 3600

TTATCGTTAG AAAAGTTGAAG GTTTATGAAA GAGAGATAAA CTTGGCTGTT TTCTGGAACT 3660

TCAATTTGAT ACTQGATTGC TGCATCTTCA TTTGAAGAAC TTGTGACACT AATCAAATCA 3720

TTAGTATTTT CTATTTTTTC TGTTTTTTCA TAAGGTATTG GAGAAAAATA ATCAAAATTG 3780
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ACGTTAGCAA 


GTTGATTTAA AAATGAGGCC TGATTATCCA AGGTATGTTC 


ATTGAACTTG 


3840 


ACATCATTGT 


AAACAGATTG ACTCGCAACT GCAATCGGAA GAGAGTATTG ATTTTCATAT 


3900 


AGGGTAAGAT 


TATCTTTTTG ATAGATATCT TTAAAGCCAT ACTTATCAAT 


AGGACTGTCT 


3960 


GAGATATTGT 


ACTGGATACC AAATAAACTA TCAGCCAAAA TACTATTATT 


TGCATATCGG 


4020 


AGATTGAGAT 


TAGTCCCAGA GGATTTAAAA CCAAGTTTAT CTAAAGTAGA 


GCTTGATGAA 


4080 


CGATTTCGAA 


CAGATGAAAA TTGAGAGATT CCATTGTAGT TGAATTTCAT 


ACTGTCATTT 


4140 


CCTGTCTGAG 


TTTGTAGTTT TTCAGTACGA GTAAATTGAT TTCCAATATA 


TGTTGAGAAA 


4200 


GATTCCATAG 


CTGGGATATC TCGACTATAA GCACTTCGAG AAGCAAATCC 


CCATTCCTTA 


4260 


GCAATTCCGT 


CCATTTGAGA TGAAGCATTT AAACTCATTT CAACCAGTAT 


AAATAAAGAG 


4320 


ATTAGAATGG 


CAAATAGATT CACAGATATA AACTTTTTGA TAACTGCAAG 


GAGTAAAAGA 


4380 


GAATAGACAA 


CCAAAAATTC AAGAGTAAGC AGAATATTCA AATCTGTTAA 


AAAAGAATAA 


4440 


TGCGATTTTA 


GATAGATGGT AGCTAAAAAT CCTGCTACTA CAAGAAAAAG 


CGAAACTAAA 


4500 


AAATTCCAGA 


CTTTAAGTTC TTTCAGACGC TTTAAGACTT CTGCTGCTGT 


GTAAATTAAC 


4560 


AAGGTAGAGA 


AAATCCAAGC ATAGCGATGT AAAAACATGT TTGGAGTATG 


CATGCCTTGC 


4620 


CAAAATAAGT 


CAAGAGCTTC TATGTAAAAG CTTGCAATTA GAAATGCAAA 


GAATATTACA 


4680 


TATATGAGTT 


TCACGTGAAA CTTAATAGAT TTCAGCGTAA AAAATAAAAT 


GGTCAAAATA


4740 


AAGGGAAATA 


GTCCAACAAA AATCATTGGG ATGGCCCCAT ACTTTGTTGT 


GTCAAAGGAA 


4800 


CCAATGAATT GCTTAGCAAA GAGATCAAGA TACCAGCTAC TTTCAGTTTG AAACTTTGTA 


4860 


ACTTCAGTCA ATTTTTCCCC ATGTGTCTGT AAATCAAATA GAGTGGGAAG AGTCATAATC 


4920 


AAACTAGCCA TACCAGCTAA AAAGGAGATA ACTATGAAAT CAAGAACAGA TGATTTTCGA 


4980 


GTCTTAAAGT 


CCCACGAAAT TTGACAGAGA TACCAGAAAA TAAGAAACAA 


TACTGTCATA 


5040 


TATCCAAAAT 


AATAATTTTG AATAAATAAG ATTGACAGAC TTGTAAAGTA 


CAATAGGAGT 


5100 


TTCTTTTCAG 


TTATCAGTAG ATGTAAACCA GTTATAATTA AAGGAATCAA 


GATAAAAACA 


5160 


TCTAGCCAGG 


TTTTTATCTC TAATTGACTG ACAGTGAAAC TCATCAGAGC 


ATAGGAAGTA 


5220 


GATAAGGCTA 


GTTTTAAAAT CTGAGGGATA GATTGAAACA ATTTATTCAA 


ACTAAAAAAG 


5280 


GTTGACAGAC 


CAATCAATCC AAATTTTAAG AGAGTTGTCA GATAGATAGC 


ATCTGGCATA 


5340 


TTCGTTAGAT 


CAAAAAAGTA AACCAGAGGC GCGAGAAAAC TACCCAAGTA 


ATAACTAGAT 


5400 


AGGGCATAGA 


AGTTTAGCCC TAGACCACTT GTAAAGGTGT AAAACAGATT 


ACTATTTCCA 


5460 


TGTAGGATAT TTCGTAAGGC TACATCAAAA ATAACGTATT GATGAAAGCC ATCTCCTAAT


5520 


AGAGGAGAGT TGTCGCTATT CCAGTAGATA CTTTGAGATA GATATACTCC AGACATAATC


5580 
ACTACAGGAA


TGATGAAAGA 


AATAAAATAG 


GTTCGATATG 


TTTTTAAAAA TGATTTCATG 


5640 


TTACCTCGTA GAATGATAGA AAACTCAGTT 


GGTTAACCCA 


ACTGAGTTTT GAAOTTTTAT 


5700 


TTAGTCTTTC 


CAAAGTTCTT 


TAACTTTTGC 


TTGTACTTCT 


GCATTTTCTA GGAATTCATC 


5760 


GTAGGTTTCA TCGATACGGT CAATGACGCC ATTTTTAGAT AAGACAATGA TATGGTTAGC 


5820 


CAAAGTTTGA 


ATAAATTCGT 


GGTCATGGCT 


GGCAAAGATG 


ATTGATTCTT TAAAGTTTTT 


5880 


CAATCCATCA 


TTCAAGCTTG 


AGATAGATTC 


CAAGTCCAAG 


TGATTTGTTG GATCATCAAG 


5940 


TACAAGGACA 


TTTGATTTTA 


AGAGCATGAG 


TTTTGAAAGC 


ATGACACGAA CTTTTTCTCC 


6000 


CCCTGACAAG


ACATTTACAG 


GTTTGTTAAC 


TTCATCTCCA GAGAAGAGCA TACGGCCGAG 


6060 


GAAGCCACGT 


AGGAAAGTAT 


TGTCATCTTC 


TTCTTTACTT 


GCGAATTGAC GCAACCAGTC 


6120 


AAGAATTGAT 


TCTCCTCCTG 


CAAAATCAGC 


TGAGTTATCT 


TTTGGTAGGT AAGATTGACT 


6180 


AGTTGTAACT 


CCCCACTTGA 


CAGTTCCTTC 


ATAGTCAATA 


TCTCCCATGA TTGCACGAAT 


6240 


TAATGCAGTC 


GTTTGAATAT 


CATTTTGTCC 


AATAAGTGCT 


GTCTTATCAT CTGGACGCAA 


6300 


GATGAAACTA 


ATATTATCX^i AGATAGTTTC 


ACCATCAATC 


TTTACAGTTA AATTTTCTAC 


6360 


TGTCAAGAGA 


TCATTACCAA 


TCTCACGTTC 


CGCTTTAAAG 


TTGATAAATG GATATTTACG 


6420 


ACTAGATGGC 


ACAATCTCTT 


CTAGCTCAAT 


CTTATCAAGC 


ATTCTCTTAC GTGATGTTGC 


6480 


CTGCCTTGAC 


TTAGAAGCAT 


TGGCAGAGAA 


ACGAGCAACA 


AATTCTTGCA ATTGTTTAAT 


6540 


TTTTTCTTCT 


GCTTTAGCAT 


TACGGTCTGC 


TAGCAATTTA 


GCAGCAAGCT CAGAAGATTC 


6600 


CTTCCAGAAG 


TCGTAGTTTC 


CGACATAGAG 


TTTGATTTTT 


CCAAAGTCAA GGTCGGCCAT 


6660 


GTGAGTACAA ACi-ri'G'rr'PA AGAAGTGACG 


GTCGTGGGAT 


ACTACGATAA CTGTGTTATC 


6720 


AAAGTCAATC 


AAGAAGTCTT


CTAACCAAGT 


AATCGATTGG 


ATATCCAAAC CGTTAGTAGG 


6780 


CTCGTCCAAG 


AGAAGAACAT 


CTGGTTTACC 


AAAAAGTGCT 


TTGGCGAGGA GAACCTTTAC 


6840 


TTTTTCACCG 


TTGGCCAATT 


CGCTCATGTT 


TTGGTAGTGT 


AATTCTTCTG GAATGTTTAG 


6900 


GTTTTGAAGT 


AGTTGAGAGG


CTTCACTCTC 


TGCTTCCCAA 


CCTCCAAGTT CGGCAAACTC 


6960 


TCCTTCGAGT 


TCGGCAGCAC 


GAACCCCGTC 


CTCGTCTGAG 


AAATCTTCCT TCATGTAGAT 


7020 


AGCATCTTTC 


TCTTTCATGA 


TGCTATAAAG 


TTTTTCATTT 


CCCATGATAA CGACATCAAT 


7080 


GGCACGTTCA 


TCTTCGTAGT


CAAAGTGATT 


TTGACGAAGA ACAGAGAGAC GTTCATCTGG 


7140 


ACCAAGAGAG 


ATGTGACCAG 


TAGTAGGTTC 


GATATCTCCA 


GCTAAAATTT TTAAAAAGGT 


7200 


TGATTTTCCG 


GCACCATTAG 


CACCX3ATTAA 


TCCGTAAGTA 


TTTCCTTCTG TAAATTTGAT 


7260 


ATTGACATCA 


TCAAAAAGTT 


TGCGATCACT 


AAAACGTAGT 


GAAACATCAG ATACTGTAAG 


7320 
CAATGTTTTT CTCCTATATG TGTAATATAT TTATTCTACT AGAAAATACA GAAATATTCA 7380

AATTTTTATT TGTCAATTTT GTGTAAATTA TATTTACAGT ATCCTTTACA CAAATCTGTA 7440 

AAAAGCAAGG CTGATTTATT TTGATAAATT ACGGTTATTT CATTAAAAAA ATGCTATAAT 7500 

TGAAAGGACT ATATCGAAGG AGAACAAAAT GACTAAACCC ATTATTTTAA CAGGAGACCG 7560 

TCCAACAGGA AAATTGCATA TTGGACATTA TGTTGGAAGT CTCAAAAATC GAGTATTATT 7620 

ACAGGAAGAG GATAAGTATG ATATGTTTGT GTTCTTGGCT GACCAACAAG CCTTGACAGA 7680 

TCATGCCAAA GATCCTCAAA CCATTCyTAGA GTCTATCGGA AATGTGGCTT TGGATTATCT 7740 

TGCAGTTGGA TTGGATCCAA ATAAGTCAAC TATTTTTATT CAAAGCCAGA TTCCAGAGTT 7800 

GGCTGAGTTG TCTATGTATT ATATGAATCT AGTTTCGTTA GCACGTTTGG AGCGAAATCC 7860 

AACAGTCAAG ACAGAGATTT CTCAGAAAGG ATTTGGAGAA AGCATTCCGA CAGGATTCTT 7920

GGTCTATCCA ATCXOTCAAG CAGCTGATAT CACAGCTTTC AAGGCTAATT ATGTTCCTGT 7980 

TGGGACAGAT CAGAAACCAA TGATTGAGCA AACTCGTGAA ATTGTTCGTT CTTTTAACAA 8040 

TGCATATAAC TGTGATGTCT TGGTAGAGCC GGAAGGTATT TATCCAGAAA ATGAGAGAGC 8100 

AGGGCGTTTG CCTGGTTTAG ATGGAAATGC TAAAATGTCT AAATCACTAA ATAATGGTAT 8160 

TTATTTAGCT GATGATGCGG ATACTTTGCG TAAAAAAGTA ATGAGTATGT ATACAGATCC 8220 

AGATCATATC CGCGTTGAGG ATCCAGGTAA GATTGAGGGA AATATGGTTT TCCATTATCT 8280 

AGATGTTTTT GGTCGTCCAG AAGATGCTCA AGAAATTGCT GATATGAAAG AACGTTATCA 8340 

ACGAGGTGGT CTTGGTGATG TGAAGACCAA GCGTTATCTA CTTGAAATAT TAGAACGTGA 8400

ACTGGGTCCG G 8411 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9064 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TGCCGTACTC AAGTACAGCC TGCGCTAAGT TTCCTAGTTT GCTCTTTGAT TTTCATTGAG 60 

TATTAGTAAC CAAAATCCGA CCACATAGCC AGCCCCTATG AATATAGCCA TTAAAGCTAG 120

CATGGAATTT AGGAAATTAA AAACCACCGC AGATACAAAG GTTAGCACAA AAACATTAAA 180

AGCAATGGTG TCAGAAGCCA AGACTAGAAT ATAGGGTGTC AACCGATCTA AAGTTTTGGA 240 

ATCTAGGAAA AATAAGTGTT TATACATGAT GACCTCCTCT ATGGCTGAAA AGCAAGCCTT 300 
TTGTTTTTTT


ACCCCAAGAC CCTATGTAGA AAAGTGAGCA AAAACX3GGAA GGTCGCTACA 


360 


ATATTATTGA 


TCACATGCAC 


CGCATAGGAT 


GGATAAATGC 


TCTTGGTATA 


GCGGGTCAAA 


420 


CCAGCAAAGA 


TGATTCCAAC 


TGTTGCAAAG 


ACGAAGATAT 


CTAACAGACT 


AGGCAGGCTT 


480 


GAAAAATGAG 


GGAGAGCAAA 


TAAAATAGAA 


GGAAGAAGCA 


AATCAAGACC 


AAATCGCGAA 


540 


TGCTTAAAGA 


AAGCATGTTG 


CAGTAATCCT 


CTATAAATCA 


ATTCTTCCAT 


CAGTGGAACC 


600 


AGAAAGAACA 


GGGCTATATA 


AATACCTAGC 


TCTGCAAAGT 


TAGTCCCACT 


ATAACCAATC 


660 


AATACAGCCC 


AACCTTCCGC 


AGTTGACTGA 


ACATGTTTAG 


CTGTCTGAAC 


GTTAAAAGAG 


720 


ATCTGGAACA


CTAGCACTAA 


TACTGTCAAA 


ATCGAATACC 


AAAGCCATTT 


TTTTCTTGGA 


780 


ATGCGGAAGA 


GATAACCATG 


GCCTCTCOTA 


ACAAGAACCA 


CAATCATGAC 


TCXIAATAAAA 


840 


AGTAAACTCA 


AGATATTTTG 


AATCCAGAAT 


AAATTGCCTA 


TCTGAGAAGA 


AAATTGCCAA 


900 


TAGTTTTGGA 


CGATAAGCX3T 


CAGCTGAGAA AGACTAAATA 


CGAAAAATAA 


GTAAGAGAAG


960 


ACTGCACTTA 


TTTTGAATAG 


AAGTTGATAC 


TTTTTCATAG 


AAATCCTCCC 


TACTATGACC 


1020 


TCACCTTGTC 


AGGCTCTACT 


GCTGTAAGAT 


TAAGAAGACA 


GTTTGTTTTT 


TTTAAGGCTA 


1080 


ACCTGACTAC 


TAGATAATAG 


ATACATTAAG 


GCATTAAAGA 


CAATGAAAAT 


ATGTCCATAG 


1140 


AATAAAATCA 


ACCTCGCATC 


CAAACCAAGA 


TAAAGTTTGA 


TTATCAAAAA 


GATGAGCAAA 


1200 


AGAATTTGAA 


ACCATAAGGT 


TTTTCCAAAA 


ATAAATTTAA 


AGCGATTTCG 


AATATCTACT 


1260 


TCCTTGATTT 


TTACCGCCAC 


CCCTTTATTA 


GCAAGAAGGA 


AAACTCCTGC 


TTCAAACAAA 


1320 


CCACTGTAAA 


GAACAAGCCA 


CCCAATAGAT 


ACGATAGAGA 


TTTGTAAAAA TGTCCCTAAA 


1380 


AGAATATCCA ACACACTACT 


CAAGAAAATA ACAAAAAATA ATCTGTATTT


CATATTAAAT 


1440 


ACCTCCATTC 


ATTTATTTCA 


CTAACAATTT 


AATAGAGCCT 


TCTACTCAAA TATCCTGTCA 


1500 


GAAAAGGATA 


GAAAGCTACT 


TTTTATAATA 


CTTCAAGCCC 


CACATGAGCA GAAGCGTGAT 


1560 


AAACAAGCAG AGAATACACC TATATAAGCG ATTAGTTGTT GATAGAATTC TGTTTCTGAA 1620


ATACCTCTAT 


ACAAACAAAT 


GACAAACATA 


AAATCTGCCA 


AGCCGATAAA 


CATAAGTTGA 


1680 


TTGGTTCTAG 


GACTAACCAA 


ATCATCATTT 


ACTTATATTT 


AAGAGTATCT 


CTTTTATTTT 


1740 


AATGTATGTT 


AGCACTGAAA 


AGCAAGACAG 


GCCAATAATA 


TTTAAAATGA 


ACAGTAACGG 


1800 


GGTTAAGTCT 


CTAAAAAAAT 


TATCTACTGA 


CACTACAAGA 


AATACTATAC 


ATATTATAGT 


1860 


CGAAACTATC 


TTTTTCTTAT 


CCATAATTAT 


TTACTCCTTT 


CCTAACAAAT 


CCAGCTTATC 


1920 


AATCAAGAGC 


GATTTTTAAC 


ATAATGTAGC 


AGCACCCGTT 


GCAACTTTGA 


CAAGTTTAGT 


1980 


ATATCATTGT 


TTTTTAAAAT 


TTTTCATCCA 


AATCTTGAAT 


TGTCATCGAA 


ACATCTTGAA 


2040 
TTGTTAAAAA


ATTTAAAAAG 


TAAGCATTAA 


AAACATACTT 


TCCTCTTTAT 


ATTGTATTGA 


2100 


TACCAACTTG 


TTTGTAGACT 


TTTCATCCTG 


CTATCACATA 


TCATTTTGAC 


AGGCGAAACA 


2160 


ATATTAAAGA 


AACTCCCCTG 


TAAATTAAGC 


TAGCAAATAC 


AGGGGAGAAA 


TTTATTTTTT 


2220 


AGAGAGTACT 


ATCCGTATCC 


TTTTTGGAAG 


ATTTTGAAAA 


TATTTTTCTA 


ATTAAGTCAT 


2280 


CCATATAAGG 


ACCAAATATA 


CCAACTACTA 


AACCAATAAT 


AAAACTTTTA 


AAATCCATAA 


2340 


TTACCACCAA 


CATATTGCTG 


CATAGGCTAC 


ACCTCCAAGT 


ATAGCTCCAC 


CTGCAGCACC 


2400 


AGTTACACCT 


ATTCCTATAG 


CAAATGGTCC 


CAATAGAAAT 


GTCAAACCGT 


TGTTGCACAC 


2460 


CCATCAATTG 


CGCCATATGC 


AACCCCTGCT 


GCACAACTAA 


TTTTTCTTCC 


CCAATCAATA 


2520 


TCTCCACCTT CAACGCAAGC AAGCATTTCA TTATCCATAA CTGCAAATTG TGACATCATT 2580


TTTGTATCCA 


TATAGTGTAT 


CACTTTTCAG 


TTACGGAACA 


AGTTTAATAT 


AAAAATTATC 


2640 


AAAAAAACAT 


AGGCAATAAA 


GAGAAAAATT


AATTTATCAT 


AGATTAGAAA TAATATGACA


2700 


AAACAATTCA 


ATGATGTTAA 


TTCAATAGTC 


TTTTGTTTTT 


TATCQGAGAT ACTTATGGAT 


2760 


AGATAAAT7UV 


GATAGGTTTG 


AAAAGCCAAG 


AGAATAATAA 


AG7VATATAGC 


CTTCATAAAA 


2820 


TTTAGCTTTC 


ATTTTTATGA 


TGTAGCGGTA 


TAGGCTAAAT 


ATCCACAAAC 


CACTGCTCCT 


2880 


CCAATTCCTC 


CTATTGCAGC 


GCCCCATGGT 


CCTAGAAGTC 


TCCCATATTT 


CACTCCACCC 


2940 


GCTGCACAAC 


CTAAAGCAGC


AACTACAGCT GCTCCTCCGG 


AATTACCTCC 


ATAAACCTCA 


3000 


CTCAGCATTG 


TTTCATTTAT 


ATTACAATAA GTATTCATAC 


AAGTCTCCTT 


TTATTAAAAT 


3060 


CCACCCGTTG 


CCCCTGTTAC 


TCCTGCCCAA AGATCCACAC 


CAAATTTAGC 


TCCTATGTAT 


3120 


CCACATGCTC 


CCATAAATGG


TGCTCCAACA CCACTCGCAG 


CACAAATAGC TGTCCCTAGC 


3180 


CCCCAGCCAC 


CAAAAGCAGC 


ACCACCACCT 


TCTAAGACAT 


TAGTTTGCCA ATTATTCTTG 


3240 


CCTCCTTCAA 


TACTAGATAA 


CATAGTTATA TCCATTTCAT


GAAATTGTTC 


CATAATTTTT 


3300 


GTATCCATGA 


CAAATACTCT 


TTTTTATTTT 


TAATTTTTGT 


CTTGTTGTAA 


CTTTGACAAG 


3360 


TTTAGTATAT 


CATCGTTTTT 


TAAAATTTTT 


CATCCAGATT 


TTGAATAGTC 


ATCGAAACGT 


3420 


CTTGAATTGC 


AAAAATTACA 


TTAGACTTCC 


TGCAAAACTA 


GAATCCTAGT 


TCATGATTGA 


3480 


TAATACCAGC ACTCAAATTC ATTCGTAATC CGAAGCGTTT ACGATGACTT CGATAGGTTG 3540


TTGAAAACAT 


TTTAAACGTT 


TTTACTTTGG 


CAAAGATGTT 


CTCAACCTTG 


CTTCTCTCCT 


3600 


TAGATAGCGC 


ATGGTTACAG 


GCTTTATCTT 


CAACTGTTAG 


CGGTTTGAGT 


TTGCTGGATT 


3660 


TACGTGAAGT 


TTGTGCTTGA 


GGATATATCT 


TCATGAGCCC 


TTGATAACCA 


CTGTCAGCCA 


3720 


AGATTTTACC 


AGCTTGTCCG 


ATATTTCTGC GACTCATTTT GAACAACTTC ATATCATGAC 


3780 


AATAGTTCAC 


AGTGATATCC 


AAAGAAACAA TTCTCCCTTG 


ACTTGTGACA ATCGCTTGAG 


3840 
TCTTCATAGC GTGAAATTTC TTTTTACCAG AATCATTCGC TAATTCTTTT TTTAGGGCGA 3900

TTGATTTTTA CTTCCGTCGC ATCAATCATT ACCGTGTCCT CAGAACTGAG AGGAGTTCTT 3960 

GAAATCGTAA CACCACTTTG AACAAGAGTT ACTTCAACCC ATTGGCTCCG ACGGAGTAAG 4020 

TTGCTTTCGT GAACACCAAA ATCAGCCGCA ATTTCTTCAT AAGTGCGGTA TTCTCGCACA 4080 

TATTGAAGAG TGGCCATAAG AAGGTCTTCT AGGCTTAATT TAGGTTTTCG TCCACCTTTT 4140 

GCGTGTTTAA GTTGATAAGC TGTTTTTAAT ACAGCTAGCA TCTCTTCAAA AGTCGTGCGC 4200 

TGAACACCAA CAAGACGCTT AAATCGTGCA TCAGTTAGTT GTTTACTTGC TTCATAATTC 4260 

ATAGAACTAT AGTAAAATGA AATAAGAACA GGATAAATCG ATCAGGACAG TCAAATCGAT 4320 

TTCTAACAAT GTTTTAGAAG TAGAGGCGTA CTATTCTAGT TTCAATCTAC TATACTATAC 4380 

CATATTTTGT TTCGCAGGGA ATCTATTATA AAAGGGTAAG TATTGCAAAA ACACTTACCX: 4440 

TTTTCTTTTA TACTTCATTA AGCTCTACTT TTTATAATAC TTCAAGCCCC ACATGAGCAG 4500 

AAGCATGATG ATTAAGCAGA GAACAGCGCC AATATAAGCG ATTATTTGTT GGTAGGATTC 4560 

TCCTGCTGTG ATACCTCTAT ACAAACAAAT AATAGACATA AAACCTGTCA AGCCGATGAA 4620

CATAAGTTGA TTGGTTCTAG GACTAACCAA ATCATCATCT TCAAACTCTC TTATCCTCAT 4680 

TTCCCTAGTG AGATAAACAG TAACCAAAAT AGAAGCXAAG TTAATAACTA CTAAAAGAAA 4740 

TTGGAAAACT ACGGAAAAAT TTAAAAACTG ACGAGATAGA AATAGATAAG TAGAAACAAG 4800 

CAAGGGCAAC TGACCTAAGA ACAATCTCGC AAGGAAGATG TTCCGTTTTT TAGCAAGAAA 4860 

AGTTTTCATT TCTTTTCTCC TTTCTTTTTA TTGATAGCAA AATAGATCAT AACTGCAATC 4920 

ACATAGGCTA TGGTATAAAA TAGCTGATAC CAAGCACTCT CCCTAAGCGG ATATAGAAAG 4980

ATGGACATGA TTAGATACAG AACXSAAAATA ATCAGTATTT TTTTCTTCAT AAGATTTCCT 5040 

CCTAAATGTG CGATTTATCT TAGTTGAGCA AGAACATTTA CACTGCTAGT ATAGCACTTA 5100 

TTTTGACCTT GGATCACTCA AATCATAAAT GGTCATCAAA ACCTCTTGAA TTGTAAAAAT 5160 

TAAAAAAGCA AGCATGAAAA ACATACTTTC CTCTTTATAT TGTATTGATA CCAACTTGTT 5220 

TGTAGACTTT TCATCCTGCT ATCACATATC ATTTTGACAG GCGAAACAAT ATTAAAGAAA 5280 

CTCCCCTGTA AATTAAGCTA GCAAATACAG GGGAGAAATT TATTTTTTAG AGAGTACTAT 5340 

CCGTATCCTT TTTGGAAGAT TTTGAAAATA TTTTTCTAAT TAAGTCATCC ATATAAGGAC 5400 

CAAATATACC AACTACTAAA CCAATAATAA AACTTTTAAA ATCCATAATT ACCACCAACA 5460 

TGTTGCTGCA TAGGCTACAC CTCCAAGTAT AGCTCCACCC GCAGCACCAG TTGCTGCACC 5520 

TTGCCATGTT CCTGTTTTAA TGCCTAGTTG AAGACCTCTT GCTGCTCCTC CTCCAACACC 5580 
TGCTTTGGCA


AAATCTCCCC 


AATTGCATCC GCCACCTTCA 


ACGCAAGCAA 


GCATTTCAGT 


5640 


ATCCATAACA GAAAATTGTG 


ACATCATTTT 


TGTATCCATG 


ACAAATACTC 


CTTTTTTAAA 


5700 


AAACTAAAAT 


AAATCAGAAT 


AGAATCCTCA TAATTTTACT 


ATAAGTCTTA 


CCAACTTAGT 


5760 


CCCAATTTAT 


CACCAACCAT 


ACCTCCTAAG 


CATGTTAATC 


CACCCCCAAT 


TGCACCAATG 


5820 


TGTGCTCCAA 


CAAATGCACC 


AGCAAGTCCA 


GCTACTCCTA 


AAGTGGCCAA 


ACCTGCTCCA 


5880 


GTTCCACCAG 


TTATAATTCC 


CGTAGTGACT 


CCTGTAATCA 


GTGCATTTTG 


ACAATCAGTG 


5940 


GAGCTATACC 


CCCCTTCAAC 


TTTCGCAAGC 


ATTTCAGTAT 


CCATAACCTC 


TAACTGTGAC 


6000 


AACATTTTTG 


TATTCATGAT 


GAATACCTCC 


TTTTTATTTT 


CAATTTGTTA 


CCAAAGTCTT 


6060 


AAATTCAATA 


AACAAATAGA 


TTTTTTATAG 


TATCTTTTTG 


ATTTTCTTAA 


AAAAGTATAT 


6120 


ACGTCTACTA 


TCTTCTTAAA 


GGTAGCAGTA 


CCTATTTTTT 


AGTCTAAGAT 


TTCAATAATC 


6180 


TTGAGTATCT 


AAAATATCTT 


AATTTOGTTA 


TTCTCCTTGC 


AATAAAAAGT 


TTTACTATAC 


6240 


TATTTATTAA 


CTTGCAGAAA 


GCAAAAAATA TTAGTAAATA 


ATAGTTTATA 


GTTAAGTTTT 


6300 


TTATTCCTAC 


CAATCCATCA 


ACTAAGTAAA 


GCATCAACGA 


TTACATAAAC 


GATTGATAAT 


6360 


ATAATTAAAA 


TTTTGCTAAC 


TATCTTATTC 


TCATCATTCT 


TAGATAACTT 


TGATATTTTG 


6420 


TAAGTAAGTA AATAAGACAG 


TAAATTAATA 


GCGATAATAA 


TACTATATTT 


AAGAATCATA 


6480 


ATCTTACAAA GAGGACATAA 


TTCCTGAACC 


TACACAAATA 


AGTGTTGCTG 


CTCCCCCAGT 


6540 


TATCGGACCA GTCGCAGCAG 


CTAATAGTAC 


TGCTCCAATA 


CAACCACCGA 


TTGCAGATCC 


6600 


TAAATTGCCT CTTCCTCCAC TAACTATTTC GAGTTCTTCA TTATCCATAA CAGAAAATTG 6660


TTCCATCATT 


TTTGTATTCA 


TGACAAATAC 


TCCTTTTTTC 




TTTGTCTTGT 


6720 


TGTAACTTTG 


ATAAGTTTAG 


TATATCATCG TTTTTTAAAA 


TTTTTCATCC 


AGATCTTGAA 


6780 


TTGTCATCGA 


AACGTCTTGA 


ATTAGCTTTT 


TTATTTCAAG 


CCACCTCTAA ATGTTTAAAA 


6840 


AAAATAATTT 


CTAATCACTT 


TTTTACCATT 


CAGGAAGTTT 


TAATGACTAT 


TCAAGATTTC 


6900 


ATAAAATATG AACTTAGTTT 


TATGACATAA 


TAGACCTATC 


CACTATATGA 


AAGGAATTGC 


6960 


CAATGACTTC 


TTATAAACGT 


ACATTTGTTC 


CTCAAATAGA 


TGCGAGAGAC 


TGTGGTGTCG 


7020 


CTGCCTTAGC 


CTCGATTGCT 


AAATTCTATG 


GTTCAGATTT 


TTCTCTAGCT 


CACTTGAGAG 


7080 


AACTTGCAAA 


GACCAATAAA 


GAAGGGACGA 


CTGCTCTTGG 


CATTGTAAAA 


GCCGCTGATG 


7140 


AAATGGGCTT TGAAACAAGA 


CCTGTTCAAG 


CAGATAAAAC 


GCTCTTTGAC 


ATGAGTGATG 


7200 


TCCCCTATCC ATTTATCGTT 


CACGTTAACA AAGAAGGAAA 


ACTCCAACAT 


TACTATGTTG 


7260 


TCTATCAAAC AAAGAAAGAC 


TATCTGATTA TTGGTGATCC 


TGACCCTTCT GTAAAAATCA 


7320 


CTAAAATGTC 


AAAAGAACGC 


TTTTTCTATG AATGGACTGG 


AGTAGCTATT 


TTTCTAGCTA 


7380 
CCAAACCCAG


CTATCAACCC 


CATAAAGATA AAAAGAATGG TCTACTAAGC 


AAGCTTCCTT 


7440 


CCTCTGATTT 


TCAAACAAAA ATCTCTCATT GCTTACATTG TTCTCTCAAG CTTATTGGTC 


7500 


ACTATTATCA ATATAGGTGG 


TTCTTACTAT CTCCAAGGAA TCTTGGATGA 


ATACATTCCA 


7560 


AATCAGATGA AATCAACTTT AGGAATCATC TCAGTTGGTC TGGTTATCAC CTATATCCTC


7620 


CAACAAGTCA 


TGAGCTTCTC 


CAGAGATTAT CTCCTAACCG TTCTGAGTCA 


GAGATTAAGT 


7680 


ATTGATGTGA 


TTTTATCCTA 


TATTCGCCAT ATTTTTGAAC TTCCCATGTC 


TTTCTTTGCG 


7740 


ACACGTCGTA 


CAGGAGAAAT 


CATTTCACGA TTCACAGATG CTAACTCTAT 


TATAGATGCC 


7800 


TTGGCTTCTA 


CCATTCTTTC 


TCTTTTTCTG GATGTTTCTA TTCTGATTCT 


TGTAGGAGGC 


7860 


GTCTTACTGG 


CACAAAACCC 


TAATCTCTTC CTTCTTTCTC TTATTTCCAT 


TCCTATATAC 


7920 


ATGTTCATCA 


TCTTTTCTTT 


TATGAAACCT TTCGAAAAAA TGAACCATGA


TGTCATGCAA 


7980 


AGTAATTCTA 


TGGTTAGCTC 


TGCCATTATC GAAGATATCA ACGGGATTGA 


AACTATAAAG 


8040 


TCGCTCACGA 


GTGAAGAAAA 


TCGCTATCAA AATATAGACA GCGAATTTGT 


AGATTATTTG 


8100 


GAAAAATCCT 


TTAAGCTCAG 


TAAATATTCT ATTTTACAAA CGAGTTTAAA 


GCAGGGAACA 


8160 


AAATTAGTTC 


TGAATATCCT 


TATCCTATGG TTTGGCGCTC AATTAGTCAT 


GTCAAGTAAA 


8220 


ATTTCTATCG 


GTCAGCTGAT 


TACCTTTAAC ACACTTTTTT CTTACTTTAC 


AACTCCTATG 


8280 


GAAAATATTA 


TCAACCTCCA 


AACCAAACTC CAATCTGCGA AGGTCGCTAA 


TAACCGTTTG 


8340 


AACGAAGTCT 


ATCTAGTCGA 


ATCTGAATTT CAAGTTCAAG AAAACCCTGT 


TCATTCACAT 


8400 


TTTTTGATGG 


GCGATATTGA 


ATTTGATGAC CTTTCTTATA AGTATGGTTT 


TGGATGAGAT 


8460 


ACCTTAACAG 


ATATTAATCT 


CACGATTAAA CAAGGAGATA AGGTTAGCCT 


AGTTGGAGTT 


8520 


AGTGGTTCTG 


GTAAAACAAC 


TTTAGCCAAA ATGATTGTCA ATTTCTTTGA 


ACCCTACAAA 


8580 


GGGCATATTT 


CCATCAATCA 


TCAGGATATT AAAAACATTG ATAAAAAAGT 


CTTGCGCXrCT 


8640 


CATATTAATT 


ACCTACCCCA 


ACAAGCCTAT ATCTTTAATG GCTCTATTTT 


GGAAAACTTA 


8700 


ACCTTGGGCG 


GTAATCATAT 


GATTAGTCAA GAAGATATTC TAAAAGCTTG 


TGAAGTAGCT 


8760 


GAAATCCGTC 


AAGACATTGA 


AAGAATGCCT ATGGGCTATC AAACTCAGCT 


CTCTGATGGA 


8820 


GCTGGTCTAT 


CAGGAGGACA 


GAAGCAACGA ATCGCTCTCG CTCGTGCTCT 


TTTAACTAAA 


8880 


TCTCCTGTTT 


TAATACTAGA 


TGAAGCTACT AGCGGTCTTG ATGTCTTGAC 


TGAGAAAAAG 


8940 


GTTATAGATA ATCTTATGTC TCTAACTGAT AAAACCATTC TCTTTGTAGC CCATCGTCTC 9000

AGTATAGCCG AACGAACCAA CCGTGTCATT GTTCTTGACC AGGGGAAAAT CATTGAAGTT 9060

GGTA 9064
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7780 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CTCCATTTTT TTGATTTCAT AAATAAACAA CCTCTCTGTT AATTTTGTAT AATTATAACG 60 

ATATCCAAGT TACTTGTCAA GTGTTTTTTA AATTTTTATC TCAAAAATAT TTTTTCGTTC 120 

AAAAAAAGGA GCCATCAGTT GATTTCAAGC TCCCTTTTAT ACAGAATTAA ACTATTTTAT 180 

AGTTCGACAA TCTTACCTGT TTCAAAGTAG ACAACCCATT CACAGATATT TTTAGCATAG 240 

TCACCGATAC GCTCCAAGTA GGAAATAACT TGG/^TAAT CACGACCCGT AACAATGGCT 300 

TCTGGATTTT TCTTAATCTC TTCAGTCGCA AGGTCACGGA TAGTTTCAAA ATAGTGGTTA 360 

ATTTGCTCAT CCATGGAGGC CACCCGGTAT GCGTCGTCAA CAGAACCATT AAGATAAAGA 420 

TCAAGTGCTG CTTCCACAAC GCTTTTAACT TCACGTCCCA TTTTTTTAAT TTCTTCCTCT 480 

ACAGCTGGAA TGCGCTCTTC CCCCTTCATA CGGATGGTTG CCTGGGCAAT GGCTACAGCG 540

TGATCCCCCA TACGCTCCAC ATCTGATACA GCCTTAAGGA CAGTCAAGAC TGTACGCAAA 600 

TCTTGAGAGA CTGGTTGTTG GAGTGCGATC ATTTCAAATG ATTTCTTTTC CAGTTTCACT 660 

TCGTATTCAT TTACTTCTGC ATCATCTTCG ATGACCTCTT TTGCCAGGTC ACGGTCATGC 720 

GTGACAAAAG CACGTACCGT ACGATTGATT TGTGAGAGCA CTTCTTGTCC CATAGCGTAG 780 

AACTGGTTAT GTAATTTCTC TAAATCTTCT TCAAATTGAG ATCGTAACAT CTTTCATCTC 840 

CTTATCCAAA TTTTCCTGTA ATATAGTCTT CCGTTTCCTT GTGTTGGGGA TCAAGGAACA 900 

TCTGCTTGGT ATCATTAAAT TCAATCAAAT CTCCATCTAG GAAAAATCCT GTCTTATCAG 960 

AGATACGTGA AGCTTGCTGC ATGGAACGGG TTACCAGAAG CATGGTGTAC TTGTCTTTTA 1020 

GACCATACAA GGTTTCCTCA ATTTTACCAG CTGAAATCGG ATCCAAAGCC GAAGTTGGCT 1080 

CATCCAAGAG GATGATTTTA GGACTAGTTG CCAAGACACG GGCCACGCAG ACACGCTGCT 1140

GTTGACCACC TGACAATCCA ATAGCTGAAT CATATAGACG ATCCTTGACC TCATCCCAGA 1200 

TAGAGGCACC TTGCAAQGCT TTTTCTACGG CTTCATCCAG AACCTGCTTA TCCTTAATTC 1260

CATTGATACG AAGCCCGTAG ACAACATTCT CATAGATAGT CATAGGGAAA GGATTAGGTT 1320

GTTGGAAAAC CATTCCGATT TCCTTACGTA ATTCAACCGT ATCTGTACGC GGACTGTAGA 1380 

TGTTGTGACC ATTGTACACC ACGGATCCAG TTGTGGTCAC CTCTGGATTO AGATCTCCCA 1440
TGCGGTTGAG AGACTTGAGG AGGGTTGACT TCCCTGATCC AGATGGACCA ATCAAGGCTG 1500

TAATTTCCTT AGGTTGGAAA GATAGGGAAA CACTATTCAA AGCCTTCTTT TTATTATAAT 1560 

AAACGGACAG GTCTGATACC TGTAAAATCG CATCTGTCAT ACGGTTTCCT TTCTAACCAA 1620 

AGTGACCAGA TACATAGTCA TTGGTGGACT GTAGCTTGGC ATTTTGGAAA ATAGTTGCAG 1680 

TCTTGTCATA CTCAATCAAA TCACCCAAGT AAAAGAAGCC TGTATAGTCA CTTGCACGAG 1740 

CAGCCTGCTG CATATTATGC GTTACAATGA TGATGGTAAA GTTTTTCTTG AGCTCAAACA 1800 

TGGTCTCTTC TAGTTGCATG GTCGCAATCG GATCCAAGGC TGAGGCTGGC TCATCCATTA 1860 

AGAGGATATC TGGCTTAACA GAGATGGCAC GAGCGATACA GAGACGTTGT TGCTGACCAC 1920 

CTGATAAGGT CAAGGCTGAC TTGTGGAGAT CGTCTTTAAC CTGATCCCAG AGGGCAGCCT 1980 

GACGAAGGGA GGTTTCTACG ATTTCATCTA GGACTTGCTT ATCCTTAACT CCAGCACGTT 2040 

CATGCGCAAA GGTAATATTA CGGTAAATTG ACTTAGCAAA TGGATTGGGA CGTTGAAAAA 2100 

CCATTCCAAT GTGTTTACGC ATTTCATAAA CGTTGATTTC TGGACGGTTG ACATCAATTC 2160 

CACGATAGAG AATCTGCCCA GTTACTTTAG CAATATCAAT AGTATCATTC ATGCGATTGA 2220 

GACTGCGTAA GTAOGTAGAT TTCCCCGATC CCGACGGGCC AATCAAAGCT GTAATTTTAT 2280 

TTCTTTCAAA TTGCATATCA ATCCCCTTAA TGGATTCATT TTTACCATAG TAAACATGGA 2340 

CATCCTTAGT AGAAAGGGCT ACTTTTTCTT CAGGAAAGGT AAGGATATGC TTCTCATCCC 2400 

AGTTATATGT TGACATGGCT TCTCCTTTAG GCAGCX3GTTA ATTTCTTGTG TAGATAGCTT 2460 

CCGAACTTAC QAGCTCCAAA GTTAAAAATC AGGATAAAGA TCAGGAGCAC AGCGGCAGAA 2520 

CCTGCTGATA CAATGGTTCC ATCTGGAATA GTGCCTTCAC TATTGACTTT CCAGATATGG 2580 

ACAGCCAAGG TTTCTGCTTG ACGGAAGATA GAGATGGGGC TAGTCACACT GAGGATATTC 2640 

CAGTTAGACC AGTCAAGAGC TGGCGCCGAT TGCCCTGCTG TATAGATCAG AGCTGCAGCT 2700 

TCGCCAAAGA TACGACCAGA TGCCAAGACG ACACCCGTTA CAATACCTGG AAGCGCTTCC 2760

GGAATAACAA CATGAACCAC TGTCTCCCAG CGAGAAATCC CAAGAGCCAG ACCAGCCTCA 2820 

CGTTGGGTAT GGTGAACGTG TTTCAAACTA TCCTCTACAT TACGCGTCAT CTGAGGCAAG 2880 

TTAAAGACTG TCAAGGCCAA GGCACCTGAA ATGATTGAAA ATCCATACTC AAACTGGACT 2940 

ACAAAGATCA AGTAACCAAA GAGACCCACC ACCACTGATG GTAAAGAGGA CAAAATTTCA 3000 

ATACAAGTCC GCACAAAGTT GGTAACAGGA CCTTTTTTAG CATATTCAGC CAAGTAAATC 3060 

CCAGCTCCCA TAGAAAGAGG TACAGAAATA ATCAAGGTAA TGACCAATAG GAAAAAGGAA 3120 

TTGTAAAGCT GAATGCCAAT CCCACCACCT GCTTGAAAAG CAGAAGACCT TCCAGTCAAG 3180 
AAAGACCAAG


AGATATGGGG 


CAAGCCCCGA 


ACCAAGATAT 


AGAGAATCAA 


GGAAGCCAAG 


3240 


ATTGTCACAA 


TGATGCTAGC AATCGTATAG AGGACAGCTG


TTGCAAGTTT 


ATCTAATTTC 


3300 


TTAGCGCGCA 


TAATTTTTCT 


TTCCTCTTTC 


TTTCGTAATC 


AATTTAATCA 


CACTGTTAAA 


3360 


AACTAAGCTC 


ATCAAGAGCA 


GTACCAAGGC 


CAGTGACCAG 


AGAACATTAT 


TATTTACAGT 


3420 


TCCCATGACA 


GTGTTCCCAA 


TTCCCATAGT 


TAATATAGAA 


GTTAAAGTTG 


CAGCTGGTGT 


3480 


GGTCAAGGAA 


GTTGGGATAA 


CAGCTGAGTT 


TCCGACAACC 


ATCTGGATAG 


CTAGAGCCTC 


3540 


ACCAAAGGCA 


CXSCGCCATCC 


CAAAGACCAC 


TGCAGTGAAA 


ATACCAGAAC 


GGGCCGCCTT 


3600 


CAAGATCACA 


CGCCAGATAG


TCTGCCAGCG 


AGTGGCTCCC 


ATAGCGAAAC 


TGGCTTCACG 


3660 


ATAATAACGA 


GGAACCGCAC 


GCAAGCTATC 


CGTTGTCATA 


AAGGTTACGG 


TCGGCAAAAT 


3720 


CATGACAAAG 


AGGACGGAAA 


TCCCTGACAA 


AATCCCAAAA 


CCAGTCCCAC 


CAAAGACACT 


3780 


GCGAACAAAG 


GGAACGACGA 


CTTGCAAGCC AATAAATCCG 


TACACTACTG 


AAGGAATCCC 


3840 


AACCAGGAGT 


TCAATAGCTG 


GTTGCAAAAT 


CTTCGCCCCT 


TTTGGTGATA 


CTTCGGTCAT 


3900 


AAAAACTGCT 


GCACCAATAG 


CAAAGGGTGT 


TGCGATAAGG 


GCTGAGAGAA 


TGGTAACGAT 


3960 


AAAGGAACCC 


AAAATCATAG 


GAAGGGCACC 


AAATTCTTTA 


CTAGAAGGAT 


TCCAAGTTCC 


4020 


TCCCAAAAGA 


AAGTCAAAGA 


TATTCACACC 


ATTGACAAAG 


AAGGTCGACA 


AGCCTTTTTG 


4080 


CGCTACGAAA 


ACCAAAATCA TGGCCACAAG GATGACTATC 


AAAGAAAGAC 


AGGCAAAGGT 


4140 


CAAACCTTTT 


CCTAATTTCT 


CCAGACGAGA ATTCTTTGAT 


GGAAGCAACA 


TTTTCTTAGC 


4200 


TAATTCTTCT 


TGATTCATTA 


TTGTCTCCCT 


TCCAACACTG 


TCACAGTTCC 


GGCAGCATCT 


4260 


TTTTCAACCT 


TCATTTCCTT AATCGGAATA TACTTCAATC 


CTTTGACAAT 


CCCTTCTTGG 


4320 


OTCTCATCCG 


AGAGAACAAA ATTGAGAAAT 


TCTGCAGCCA 


ACTCATTGGG 


CTGCCCCAAT 


4380 


GTATACATAT 


GCTCATAAGA 


CCACAAGGGC 


CAATTATTGC 


TACTTATATT TTCTGGACTT 


4440 


AAGTCATAGC 


CATTCAACTT 


CATGCTTTTG 


ACCGAATCAT 


CTATATAGGT 


AAGAGATAAA 


4500 


TAAGAGATAG 


CTCCTGGACT 


TTTTGATACG 


ATTGATTTTA 


CCGCTCCATT 


TGAATCCTGC 


4560 


TCCTGACTTT 


GCATGGCAGA 


CTGACCTTCC 


ATAATGACAG 


TATCAAAGGT 


AGCACGAGAG 


4620 


CCAGAGCCGG 


CTGCCCGATT 


GATAACAGAG 


ATGGGTAAGT 


CCTTACCACC 


AACCTCTTTC 


4680 


CAATTGGTTA 


CCTCACCTAT 


GAAGATTTGA 


CGAAGTTGCT 


CTGTCGTTAG 


GTTATCAACA 


4740 


TCAACCTCCT 


TATTGACAAT 


CAGAGCCAAG 


CCAGCTACCG 


CGACCTTGTG 


GTCAACAAGA 


4800 


GCAGAAGCAT 


CAATTCCGTC 


TTTTTCXITCA 


GCAAATACAT 


CTGAGTTTCC 


TATATCAACT 


4860 


GCCCCAGACT 


GAACCTGGGA 


CAAGCCTGTA 


CCAGAACCTC 


CCCCTTGGAC 


ATTGACCGTT 


4920 


TTTCCAACAT 


GGATCGTGCC 


AAATTCATCT 


GCCGCTACTP 


CAACCAAGGG TTGCAAGGCA


4980 
GTTGAGCCAA


CAGCCGTTAT 


GGATTCTCCA 


CGATCAATCC 


AGCTAGCACA 


GCCTACTAAA 


5040 


CAAGCCGTCA GCCAAAAAGC 


GATAAGAGAC 


AGAGCAAGCT 


TTTTTCTTTT 


TTTCACTGTT 


5100 


TTTCTCCTCG 


AAAATAATTA 


TGAATACTGT 


GAATTTTTTA AGTAGTTCTT 


TATGAGTTGA 


5160 


CGCATGAATT 


CTTACCAAAT 


TTCTGCGCAA 


TTGATTATTT 


ATATAATATA 


GGCTATATTA 


5220 


CTCTTTCCTA 


ACCTCCTTTT 


TTCATATGTG 


GATAAAATCT 


CTTGTCTATC 


CCTTCCCCCA 


5280 


TTGTCACCCA 


TTATAGTCAT 


TTCGTGTCTC 


TTTTTCCCCT 


TTTTAATGCA 


AGGGAAATTA 


5340 


CTCTCCTTAG 


ATGATAATCC 


AAAAGCTAGA 


AAGGTATCTC 


AAACCTCTCT 


ACTCTCCCAG 


5400 


ACTAGTTTAC 


AACTAAAAGG 


AAAAGATTCT 


ATTTTATGAG 


AAATCTAGTT 


TACAAGCGGT 


5460 


AAGAACGCTA 


ATAACTAAAC 


TTCTTGTACT 


CTTTGAAAAT 


CTCTTCAAAC 


CAGTGTTTTG 


5520 


AGCTATCTAT 


GGCTAGCTTC 


CTAGTTTGCT 


CTTTGATTTT 


CATTGAGTAG 


TAAAACTACA 


5580 


TGTAATGGCA ATCAAGATAT CAAGAATCAT 


CCTACTAAAA 


AAATCCATAC 


TTTCACTATA 


5640 


ACATAGAATA 


AGATATTTGA 


CTAGCATTTT 


CATTTGAATC 


TGAGGCCTTT 


TGGAAAATAA 


5700 


TTTTTCAAAA 


CATTTCCAGT 


AACCTTTGCA 


AAGCCCAAGC 


CATTGCCTTT 


AACCAAAACT 


5760 


TGGTACCAAC 


CATTTGGCAG 


ACTTTCTGCC 


AGCTGAACGG 


TTTCTCCAGC 


CGCATACTTG 


5820 


ACAAACGCTT 


CTTGGCCAAT 


TTCAACCGAC 


TGTTCGACCT 


GACTCGGTTT 


CAAGGCTAAA 


5880 


CCAAGAGCGA 


AACTGGGCTC 


AAAGCGTTTC 


TTCTTAAAAG 


TACCCAGATG 


CAGTCCATTG 


5940 


CGAGCAATCT 


TGAGCTTCCA 


TAAATCTGGC 


AAAAGTTCTG 


GCAAGAGATA 


AAGCTGGTCT 


6000 


CCAAAAATCT 


GCAAGATACC 


CGGTAGATTG 


ACCTTCAAAT 


GGTTTTGGGC 


AAATTCCTGC 


6060 


CACAAGGCAA


CTTGTTCACG 


GCTGAGGTTA 


CTCTTACTTG 


CCTTAAATTT 


AGGAGCTGGA 


6120 


TTGTTACCCT 


TAAACTGTAG 


ATGGGCAACA 


AACTGACCCT 


CTCCCTTAAA 


CTGATGAGGA 


6180 


TACATCCGAG


CCGTTTCTGG 


CAGGTCAATA


CCAGCTACCA 


TTCCATTGAT 


ATGCTCTACT 


6240 


GGCAACAAGT 


CAAAATCATA 


CTCTTCCAGC 


AACCAATTGA 


CAATCTCTTC 


GTTTTCCTCG 


6300 


GGTGCCCAGG TACAGGTCGA ATAAACCAGA 


TGACCACCTT 


CAGCTAACAT 


GGTCACTGCA 


6360 


TCCTCCAGAA 


TTTCTCTTTG 


CAAGCTAGCA 


CATTGACTCG 


GATAATCTAA 


GCTCCAATAG 


6420 


TCCATAGCAT 


CAGGTTGCTT 


ACGAAACATT 


CCTTCACCAG 


AGCAAGGGGC 


ATCAAGAACG 


6480 


ATTAAGTCAA 


AATAGCCTTT 


AAAGACCTTG 


ACCAAGCGGT 


CGGCAGATTC 


ATTGGTCACC 


6540 


ACGACATTOXS 


TCGCTCCAAA 


ACGCTCCATG 


TTTTCAACCA 


AAATCTTAGC 


CCGTTTGCTT 


6600 


GAAATTTCAT 


TGGAAnCAAG 


TAGCCCCTCC 


CCTGCTAGAT 


AGGCTGCCAG 


TTGAGTTGAT 


6660 


TTGCCCCCCG 


GTGCAGCAGC 


CAAGTCCAAG 


ACCTTCATAC 


CAGGACTGGG 


TTGGGCTACT 


6720 
TGAGCCACCA


TTTGAGCAGC 


AGGTTCTTGC 


GAATAAACTA 


AACCTGTAGC 


ATGCTCAGGC 


6780 


GATTTCCCTG 


AAACCTTCCC 


ATAGTGGCCC 


CAAGGGGTTT 


GAGTAATGGC ATCAGAAAAG 


6840 


GAAAGTTGCT 


CTTCTTTTAA 


GGGATTGACC 


CGAAAGGCCG 


AAACCGCTTC 


CTCCTCAAAA 


6900 


GAGGCAAGAA AATCTCTTGC CTCATCTCCT AGTATCTCTT TATATTTTTC AACAAATCCT 6960


TCTGGAAATT 


GCATTTAAGT 


TCTTTTCCTT 


TCGTAAATAT 


AGGACTGAAT 


TTCCTCCTGC 


7020 


ATCTCAAGAG 


GCACCATCAT 


GACCGGCTGT 


CTGGTTTGAA 


AATCAGGAGC 


TTCACCAAAA 


7080 


AGGGTCACAA 


CCCGATAGCC 


CAGACTTTCC 


CCTAAAATAC 


TAGCTGCGGC 


ATAATCCCAT 


7140 


GGTTGCAGAT 


AAGTGAGATA 


GGTCAACAAA 


CGCCCTGACA 


AAATCTTGGC AAAACTAATG 


7200 


GCCGCACTTC 


CATAGACACG 


AACACCAAGA 


ACCGCTCGGC 


TCAAATCAGC 


CAGCCCCCAT 


7260 


TCATTGGTTT 


CCAGCATACC 


ACTATTCCCT 


GCAATGAGAA 


AATCTCCAAG 


TGGTTTAGTT 


7320 


TTAAAAGGAG 


CTAGGGACCT ATCATTTAGA CAAACTGGAA 


ATTCCCCACC 


ACCGTGGTAA


7380 


CAATCCCCTT 


TGACCACATC ATAAATCAGA 


CCAAACTGTC 


CCTGACCATT 


TTCAAAATAA 


7440 


GCCATCATAA 


CAGCAAAATC 


TTCCTGCTGG 


GCTACAAAAT 


TATTGGTACC 


ATCAATGGGA 


7500 


TCAATGACCC 


AAACCTTGCC 


CTCTTGAACC 


GAGGCTCGCA 


GACAACCTTC 


TTCAGCACAA 


7560 


ATCTTATCCT 


CAGGATAACG 


GGACAAAATC 


TCACCAACCA 


AGAGTTCCTG 


AACTTCTTTG 


7620 


TCCAGTCTGG


TCACCAAATC 


TGTTGGAGAG 


GACTTGGTTT 


CAACACX3CAA GTCTTCCTGC 


7680 


ATATGGTCAA 


GAATGTACTG ACCTGCTTTC 


TTAACAAGCT 


CTTTAGCAAA TTCAAATTTA 


7740 


CTTTCCAAGA GAAATCTTTC CTTCCCCTTT TTCTTTGGGG 7780



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4820 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GTAATGATAT AGGAACACCA GGTGACCTGA TGGGACGTCG TAAGCCTATG AACTACTAGC 60 

TGCTAAAGGC TTTAAAGATG GTATGGTACC ATATATCTCA AACCAATACG AAGAAGAAGC 120 

CAAACAAAAG GGCAAGACAA TCAATCTCTA CGGTAAAACA AGAGGTTTGG TTACAGATGA 180 

CTTGGTTTTG GAAAAGGTAT TTAATAACCA ATATCATACT TGGAGTGAGT TTAAGAAAGC 240

TATGTATCAA GAACGACAAG ATCAGTTTGA TAGATTGAAC AAAGTTACTT TTAATGATAC 300 

AACACAGCCT TGGCAAACAT TTGCCAAGAA AACTACAAGC AGTGTAGATG AATTACAGAA 360 
ATTAATGGAC GTTGCTC?rTC GTAAGGATGC AGAACACAAT TACTACCATT GGAATAACTA 420

CAATCCAGAC ATAGATAGTG AAGTCCACAA GCTCAAGAGA GCAATCTTTA AAGCCTATCT 480 

TGACCAAACA AATGATTTTA GAAGTTCAAT TTTTGAGAAT AAAAAATAGT GTCTACTATT 540 

AGGAAATAAA GTTTAAAAAG GTGATGAAGA ACAAACCAAG ATTCAAGCAG GAATTCCTAC 600 

TGATAATGAA GTAAGTTATG ATCTTATTTA TCAGCAGGAA ACTCTTCCTG CAACAGGTTC 660 

ATCAACTTCT GAGCTTACAG CTTTAGGCCT ATTAGCTGTT GGTAGTTTAG TTCTTTTGGT 720 

TCATAATATG ACGGGAACAG TTTTTTGCTC CCTCTGAAAA GTCATCATTT GATGGCTTTT 780 

TTCTATATAG GGTAAAAGAT AGGGTAAAAG GCTATCATCG GACAAAATAA AGAAGGCATG 840 

ATATAATATA AAGTAGATTT CTATGTCATA AAACAAGAAC TGTTTGGACA TCATTCATTT 900 

GAAAACTCTC TATGTTCAAA CAATAGTAAA ATAAAATAGG GGATCTAAAT CCTTGCTATG 960 

AAAGGAAAAA ACTCAATGGC TACTATTCAA TGGTTTCCTG GTCACATGTC TAAAGCTCGT 1020 

CGACAGGTGC AGGAGAATTT AAAATTTGTT GATTTTGTGA CGATTTTAGT AGATGCACGC 1080 

TTGCCTCTAT CTAGTCAAAA TCCTATGTTG ACXTAAGATTG TTGGTGATAA ACCAAAACTC 1140 

TTGATTTTAA ACAAGGCCGA CTTGGCTGAT CCAGCAATGA CCAAGGAATG GCGTCAGTAT 1200 

TTTGAATCAC AAGGAATCCA GACGCTAGCT ATCAACTCCA AAGAGCAAGT GACTGTAAAA 1260 

GTTGTAACAG ATGCGGCCAA GAAGCTCATG GCTGATAAGA TTGCTCGCCA GAAAGAACGT 1320 

GGGATTCAGA ITGAAACCTT GCGTACTATG ATTATCGGGA TTCCAAACGC TGGTAAATCA 1380 

ACTCTGATGA ACCGTTTGGC TGGTAAAAAG ATTGCTGTTG TTGGAAACAA GCCAGGGGTC 1440 

ACAAAAGGTC AACAATGGCT TAAAACCAAT AAAGACCTGG AAATCTTGGA TACACCOGGG 1500

ATTCTCTGGC CTAAGTTTGA GGATGAAACT GTTGCACTTA AGTTGGCATT GACTGGAGCT 1560 

ATCAAAGACC AGTTGCTTCC TATGGATGAG GTTACCATTT TTGGTATCAA TTATTTCAAA 1620 

GAACATTATC CAGAAAAGCT GGCTGAACGC TTCAAACAAA TGAAAATTGA AGAAGAAGCG 1680

CCTGTGATTA TTATGGATAT GACCCGCGCC CTCGGTTTCC GTGATGACTA TGACCGTTTT 1740 

TACAGTCTCT TCGTGAAGGA AGTCCGTGAT GGCAAACTCG GTAACTATAC CTTAGATACA 1800 

TTGGAAGACC TCGATGGCAA CGATTAAAGA AATCAAAGAA TTCCTTGTGA CAGTCAAGGA 1860 

GTTAGAAAGC CCTATTTTTT TAGAGCTTGA AAAGGATAAT CGCTCAGGAG TTCAAAAGGA 1920 

AATCAGCAAG CGTAAAAGAG CCATTCAAGC TGAATTAGAT GAAAATTTGC GCTTGGAATC 1980 

CATGCTTTCT TATGAAAAAG AACTTTATAA GCAAGGATTG AGCTTAATTG CAGGTATTGA 2040 

TGAGGTTGGT CGTGGTCCTC TTGCTGGTCC TGTAGTCGCT GCGGCCGTTA TTTTATCTAA 2100

AAATTGTAAG ATTAAAGGTC TCAACGACAG CAAGAAAATT CCTAAAAAGA AACATCTGGA 2160

GATTTTCCAA GCCGTTCAAG ACCAAGCCTT GTCGATTGGA ATTGGTATCA TAGATAATCA 2220

GGTCATCGAC CAAGTCAACA TCTATGAAGC AACCAAACTA GCCATGCAAG AAGCAATCTC 2280

CCAGCTCAGC CCTCAACCAG AGCACCTTTT GATTGATGCC ATGAAACTGG ACTTGCCCAT 2340

TTCACAAACC TCCATTATCA AAGGAGATGC CAACTCCCTC TCTATCGCAG CAGCATCTAT 2400

AGTAGCCAAG GTAACACGTG ATGAATTGCT GAAAGAATAC GATCAGCAGT TCCCTGGCTA 2460

TGATTTCGCT ACTAATGCAG GATATGGCAC AGCTAAACAT CTGGAAGGCC TCACAAAACT 2520

AGGAGTTACC CCAATTCACC GAACCAGCTT TGAACCCGTT AAATCACTGG TTTTAGGTAA 2580

AAAAGAAAGT TAATTGAAAG GAAATAACAT GGAGGAACAG TCGGAAATAG TCCGTTCTAA 2640

GAAAGAATTC GCCTTTGCAT CCAGCACTAT ACTATCCCAA GTTGGTCGAG GAATCATTGT 2700

CGGCCTCATC GTTGGAATTA TCGTCGGATC CTTTCGTTTC TTAATTGAAA AGGGCTTCCA 2760

CCTGATACAA GGAGTTTATC AAGATCAAGG GTACTTAGTG CGCAATCTTT TTGTACTGGT 2820

TTTGTTTTAT ATACTCATCT GTTGGCTCAG TGCCAAACTA ACACGGTCAG AAAAAGATAT 2880

TAAAGGCTCA GGAATTCerC AAGTCGAAGC CGAACTGAAA GGCCTCATGT CCCTCAACTG 2940

GTGGGGCATT CTTTGGAAAA AATATGTGCT AGGTATTCTT GCTATTGCCA GTGGACTCAT 3000

GCTGGGTCGA GAGGGACCCA GCATTCAACT TGGAGCAGTT GGTGGTAAAG GAATTGCCAA 3060

GTGGCTCAAA TCCAGTCCAG TAGAGGAACG TTCCTTGATT GCCAGTGGAG CTGCAGCAGG 3120

TTTAGCCGCA GCCTTTAATG CTCCTATTGC AGCACTTCTC TTTGTTGTAG AAGAAGTCTA 3180

TCACCATTTT TCGCGCTTTT TCTGGGTCTC AACTCTAGCA GCCAGCATCG TAGCAAACTT 3240

TGTGTCTCTA CTCATGTTCG GTTTGACACC AGTATTGGAT ATGCCAGATA ACATTCCTCC 3300

CATGACCCTA GATCAGTATT GGATATATCT CGTCATGGGA ATTTTCCTTG GATTTTCAGG 3360

TTTTCTCTAT GAGAAAGCTG TATTAAACGT TGGAAGAGTT TATGACTTGA TTGGTCAAAA 3420

AATCCATTTG GATAGGGCTT ATTATCCCAT CTTGGCTTTT ATCCTTATCA TACCAGTCGG 3480

AATCTTCTTA CCTCAAATCA TTGGTGGCGG AAATCAGCTT GTCCTTTCTT TAACTGAACA 3540

AAATTTTAGT TTCCAAGTTT TATTAGCTTA CTTTTTAATC CGCTTTATTT GGAGTATGAT 3600

TAGCTATGGA AGTGGACTGC CAGGAGGAAT TTTCCTCCCC ATTTTAGCTC TTGGTTCTTT 3660

GCTTGGTGCC TTAGTTGGTG TTATCTGTGT CAATCTTGGA CTTGTCAGTC AAGAGCAATT 3720

CCCTATATTT GTCATTCTAG GAATGAGTGG CTATTTTGGA GCCATATCAA AAGCTCCCTT 3780

AACCGCTATG ATCCTCGTAA CTGAGATGGT AGGAGATATT CGCAACCTTA TGCCACTTGG 3840

TCTTGTCACT CTTGTTTCTT ATATTATCAT GGATTTGCTC AAAGGTACGC CAGTCTATGA 3900

AGCCATGCTG GAAAAAATGC TTCCAGAAGA AGTATCTAGC GAAGGAGAAG TTACACTTAT 3960

CGAAATACCA GTTTCTGATA AAATTGCTGG GAAACAAGTT CATGAACTCA ACTTACXACA 4020 

CAACGTCCTC ATCACAACTC AAGTCCATAA TGGCAAGAGC CAAACAGTTA ACGGCTCAAC 4080 

CAGAATGTAT CTGGGTGATA TGATTCACCT GGTTATTCCA AAAAGTGAAA TTGGAAAAGT 4140 

CAAAGATTTG TTGTTGTAGT ATGAGTATTT ACATAATTTA TGTTATGTAA ATGATCAGTT 4200 

TGATTTATTT AGAAAACCGA TTCTCAGGAA TGAGATCGGT TATTTTTTAC TGATGAGGAA 4260 

TTTTACATAT AAATAATTGA ACTTTATTAA AAATAAGACT ATAATTAAGT TAGAAATGAT 4320 

AAAGTATAAA GCTAGAAAGG AGTTTACTGT ATCAAATCTG TACAGTAAGA TTAAAATCAT 4380 

GAAAAAGAAA ACAATAGCAA TTATATAGAG AAATGAAATA GAAATAGGAT AAAACAATCA 4440 

GGACAATCAA ATCAATTTCT AGCAATGTTT TAGAAGTCCA GATGTACTAT TCTAGTTTCA 4500 

ATCTATTATA CAATGTGTTT TGTATCTCAT AGCTCCTTAT ATAGCTCTTC AGTTATGTAG 4560 

TATTAACAGA AGTTTAGTGG GTGAGATTTT TATTATTTTC CTTATTCTGT TTTGTTTGTA 4620

GGTCTAAGTC TTTTTATCAC TTTGAAAAAC TCCTATAACA TCTTTCCGAA AAACTATAAT 4680 

TTTCTTGAAA AATATACAAG TCTATGCTAT ACTACTAGTA TACTTACTTA TGGAGAAAAT 4740 

ACATGAAACG TGAGATTTTA CTGGAACGAA TCGACAAACT AAAACAACTC ATGCCCTGGT 4800

AAGTTCTGGA ATACTACCAA 4820 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21338 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CTACGACATC ATGATTAACA GTCATGCGCT ACTACCAACT GAGCTATGGC GGATAAAATA 60 

GTCCGTACGG GATTCGAACC CGTGTTACCG CCGTGAAAAG GCGGTGTCTT AACCCCTTGA 120 

CCAACGGACC TTCTATCTGT AGCAGATATA ACCATTATAT CAATTTCTTG CTAATTGTCA 180 

ATCACTTTTG AGATTTTTTC TCTAAAATAT CTTTTAATTT TCTAATTTTT AATCTTGAAA 240 

TAGGACAACG ATGGTCTTCA TAGAAAACAA TTTCTAAGTT TTTTCGATCA ATTTCTCTGA 300 

TATTACCTAT ATTTACCAAA AATGACTTGT GAGGAGAATA AAATCGCTGA GTATGTTTGT 360 

CCTTTTCCTG AATATCTGTC ATGGTACCAT AAAACTCTTT TGCAAAATTC TTACCAATAA 420 




TGCGCAATTT ATGAGATACC CCTGTTGTTT CAATATACAA AATATCATGG TAAGGAATTT 480 

TT7VAATCATT TCCCTTGTAA TTGTAGTCGA AATAATCTAC AACATCTTCA TTTTCAAGTA 540 

ACATACTCTT CGTGTAGAAG ATATTTTGCT CAATTCTCTT CTTAAACATC TCATCATTGA 600 

TATCCTTATC AACAAAATCT AGGGCTGATA CCTGGTATTT ATAGGTTAGA GTCGCAAACT 660 

CTGATCGACT AGTGATAAAG ACGATAATAG CGTAAGGATT GTAATGACGA ATGAGCTGAG 720 

CCACTTCAAA TCCCTTTTTC TCAATTCCAT GAATATCGAT ATCTAGGAAA TAAAGCTGAT 780 

TTACTTCATC ATTTTCAATG TATTCTTCAA ATTCACGGAC TTTTCCCGTT GTCTTGTATG 840 

ATATTGGAAT ATTCGATTCT TTCGAAATTT CATCCAATAT TCTCTCTAGT CTCACTTGAT 900

GTTCAATAAC ATCTTCTAAA ATTAAAACTT TCATTCAAAT TCCCTCTTAA ATCTAATGAT 960 

TTGTCTAAAT GTACTGCCTT CCATCTCTGT TTCTAAAATA ATATTGTTGT ACTTATCTAG 1020 

TAGTPCTTTC ACATTATTTA ATCCGACTCC GCGATTTCTT CCCTTAGTGG AGAATCCTAA 1080 

GGCAAATAGA TCTCCTGAAG GAGTCATCGT CATTTTACAT GAATTCTGAA TCACAATAAC 1140 

TGTTTCAGTT TCCATCTTAA TAACTGCTAC TTCCATCTGC TTTTTATAGC TATCAGCCGA 1200 

TCCTTCGACA GCATTATTCA ATAAAACGCT CATGATACGA ACCAAATCCA ATAGTTCAAT 1260 

TGGAAGCTTG GTAATCGTAT CTTTTACTTC CAGTGTAAAC TCTACACCAT TATTTCGAGC 1320 

ATAGACAATT GACTGAGCAA CCAAACTTCG TAAAGCTGAG TCTTCTATGT TGTTCAAATC 1380

AAAGTAAGTG TACTTATCTG AACGCAATTT ATGATTTGCT TTGACTAAAA CTTCATTGTA 1440 

AATTCTGTCA ATTTCCTGTA AATTACCACT GTCAATTGCC ATCTGCATGC TGACAAGCAT 1500 

TCCAGCATAA TCATGTCGAA AACCACGGAT TTCATTATAC AGACCAACAA TTTCATCTGT 1560 

GTAATTCTGT AAATGTTTCT GTTCAAATTT CTTCTGCTTC AAAGCAATCT CTTTCTCCAT 1620 

TTGAACTTTA TGAGAATTCA TTGCAAAGAA GGTCAAAAGG AGAGAGATAA AGACAATAGA 1680

TGACAAAATA CTTCCAAAAC TATTCAAATG TTTAATCGTA CTTACCATAT CTGAAACGAA 1740 

AGATACAATA TGTAGCAATA GTAAAGCAAA AAATACTTTT TTCAAGAAAG GATAAAGGTA 1800 

GTCCTTGTCA AAATAGGCTA GTTCCAAATG GAAATAGTAA ATGATTTTTA ATGTAACAAA 1860

ATAGGTTAAC ACCGTCACAA CGAAAAAGAA TGGGAAATGA TATTGTAAAA CAAAATTATC 1920 

TCCTGTTATA GAGGAGAAAA TTACGGACAG AAAGTTATGA GTGCTCTCAT ATAAAAGAGA 1980 

TAGTAGTAAA CTTAGGAATA GTCCTCTATC CXTTCTCATAC TGTTTCATCC ATCGAAAATA 2040 

GGAATATAAG CCCAAAGGAA ATAAAAATCT TTCAATCCCT ATTTTATCTA AATATAGAAG 2100 

ATAAAAGGAA AATTCAAGTA CTATTTCAGT TAGTAATGTA TAAGCACCAA AAACGTATAA 2160 

TTCTTTTCTA TTTATTCGAC CTTTACAAAT TAAACGGTAA CTGTGACTAA TAATTAAAAA 2220

ATGAACAATA ACTGTCCCAA ATCCAAGTAA ATCCATTACT CTTTCTCCTT ATTTCATTAC 2280

TTTTTTCGTA GGAAAAGAAA ATCAAGGATG ATTCTTGAAA TCCTCATCTC CCCACCTTTA 2340 

ATCTTTTGTA AGTCTTTTTC CTTCAAAGCT ACAAACTGTT CCAATTTAAC TGTGTTTTTC 2400 

ATAATAAAAT CTCCTAAAAT GTTTTTTCTT GTAAGCTAAC TTACAAAAAC CATTATACAA 2460 

AATGGAATTT CGTTTTAGAT AAAATTCTCT CAACTGTCAT TTTTTTCTCC CAAAGTGTAC 2520 

TTTTTTAAGA AAAAAGCCGG GAAAATTCCC AGCTTTGCTA TTATATTGAT CCCAGCAGGA 2580 

TTCGAACCTG CGACCGTTCG CTTAGAAGGC GAATGCTCTA TCCAGCTGAG CTATGAGACC 2640 

TAATACAATT ATTCTACCAA AAATTCAATT AAAAGTCAAT TTTCTATTTA TGGTAGGGGA 2700 

ATCCCTGCTG AATCX3TAAAA GOGCGATAGA TTTGTTCAAC AAGAACTAGT CTCATTAACT 2760 

GATGGGGTAA GGTTAGGCGA CCAAAACTGA CAGAAAGATT GGCTCTATTT TTTACAGATG 2820

ATGATAATCC TAAACTTCCC CCAATAATAA AAGTAAGAGT AGAAAATCCT TTTATAGAAG 2880 

TTTCTTCTAA CTGCTTACTA AATTCTTCTG AGAAGAAAGT TTTCCCTTCA ATGGCTAACA 2940 

CAATAACGAA ATCACGGTCA GCAATTTTTG ATAAAATTCT CTGACCTTCT ATTTCTAAAA 3000 

TCTTTTGATT TTCTGATTCA CTGGCCTTAT CTGGTGTTTT TTCATCTGAT AACTCAATCA 3060 

TTTCAAACTT AGCAAATCTA GAAATTCGTT TTGAATACTC TGCGATACCA TCTTTTAAAT 3120 

ACTTTTCTTT CAGTTTCCCA ACTGTTACAA CTTTAATTTT CATGACTCTA TTCTAACATA 3180 

TTCTCTATTT TTTCACATCT TATTCACAAA ATAAAAAATA GATTTCAATT AAGAAAATCA 3240 

CAATTTCAAA AGAGTTATCC ACAGTTTGTG TAAAACTTTT GTGTTTAAGT TATAATTAAG 3300

CTAGTCAGTT TATACTTTCA GTAATTCAAA CATATGGAGG CAAATATGAA ACATCTAAAA 3360 

ACATTTTACA AAAAATGGTT TCAATTATTA GTCXSTTATCG TCATTAGCTT TTTTAGTGGA 3420 

GCCTTGGGTA GTTTTTCAAT AACTCAACTA ACTCAAAAAA GTAGTGTAAA CAACTCTAAC 3480 

AACAATAGTA CTATTACACA AACTGCCTAT AAGAACGAAA ATTCAACAAC ACAGGCTGTT 3540 

AACAAAGTAA AAGATGCTGT TGTTTCTGTT ATTACTTATT CGGCAAACAG ACAAAATAGC 3600 

GTATTTGGCA ATGATGATAC TGACACAGAT TCTCAGCGAA TCTCTAGTGA AGGATCTGGA 3660 

GTTATTTATA AAAAGAATGA TAAAGAAGCT TACATCGTCA CCAACAATCA CGTTATTAAT 3720 

GGCGCCAGCA AAGTAGATAT TCGATTGTCA GATGGGACTA AAGTACCTGG AGAAATTGTC 3780

GGAGCTGACA CTTTCTCTGA TATTGCTGTC GTCAAAATCT CTTCAGAAAA AGTGACAACA 3840 

GTAGCTGAGT TTGGTGATTC TAGTAAGTTA ACTGTAGGAG AAACTGCTAT TGCCATCGGT 3900

AGCCCGTTAG GTTCTGAATA TGCAAATACT GTCACTCAAG GTATCGTATC CAGTCTCAAT 3960

AGAAATGTAT CCTTAAAATC GGAAGATGGA CAAGCTATTT CTACAAAAGC CATCCAAACT 4020

GATACTGCTA TTAACCCAGG TAACTCTGGC GGCCCACTGA TCAATATTCA AGGGCAGGTT 4080

ATCGGAATTA CCTCAAGTAA AATTGCTACA AATGGAGGAA CATCTGTAGA AGGTCTTGGT 4140

TTCGCAATTC CTGCAAATGA TGCTATCAAT ATTATTGAAC AGTTAGAAAA AAACGGAAAA 4200

GTGACGCGTC CAGCTTTGGG AATCCAGATG GTTAATTTAT CTAATGTGAG TACAAGCGAC 4260

ATCAGAAGAC TCAATATTCC AAGTAATGTT ACATCTGGTG TAATTGTTCG TTCGGTACAA 4320

AGTAATATGC CTGCCAATGG TCACCTTGAA AAATACXSATG TAATTACAAA AGTAGATGAC 4380

AAAGAGATTG CTTCATCAAC AGACTTACAA AGTGCTCTTT ACAACCATTC TATCGGAGAC 4440

ACCATTAAGA TAACCTACTA TCGTAACGGG AAAGAAGAAA CTACCTCTAT CAAACTTAAC 4500

AAGAGTTCAG GTGATTTAGA ATCTTAATTG ACATCTATGT AAAGAAAGCT TTACATAAGA 4560

GAAAAGATGT GTTAGTGTAG AATCATGGAA AAATTTGAAA TGATTTCTAT CACAGATATA 4620

CAAAAAAATC CCTATCAACC CCGAAAAGAA TTTGATAGAG AAAAACTAGA TGAACTAGCA 4680

CAGTCTATCA AAGAAAATGG GGTCATTCAA CCGATTATTG TTCGTCAATC TCCTGTTATT 4740

GGTTATGAAA TCCTTGCAGG AGAGAGACGC TATCGGGCTT CACTTTTAGC TGGTCTACGG 4800

TCTATCCCAG CTGTTGTTAA ACAGATTTCA GACCAAGAGA TGATGGTCCA GTCCATTATT 4860

GAAAATTTAC AGAGAGAAAA TTTAAACCCA ATAGAAGAAG CACGCGCCTA TGAATCTCTC 4920

GTAGAGAAAG GATTCACCCA TGCTGAAATT GCAGATAAGA TGGGCAAGTC TCGTCCATAT 4980

ATCAGCAACT CCATTCGTTT ACTTTCCTTG CCAGAACAGA TTCTTTCAGA AGTAGAAAAT 5040

GGCAAACTAT CACAAGCCCA TGCGCGTTCC CTAGTTGGGT TAAATAAGGA ACAACAAGAC 5100

TATTTCTTTC AACGGATTAT AGAAGAAGAT ATTTCTGTAA GGAAATTAGA AGCTCTTCTG 5160

ACAGAGAAAA AACAAAAGAA ACAGCAAAAA ACTAATCATT TCATACAAAA TGAAGAAAAA 5220

CAGTTAAGAA AACTACTCGG ATTAGATGTA GAAATT/UVAC TATCTAAAAA AGACAGTGGA 5280

AAAATCATTA TTTCTTTTTC AAATCAAGAA GAATATAGTA GAATTATCAA CAGCCTGAAA 5340

TAAGGCTGTT CTTTTATTTT TTTATCTCAC AAGGTTATCC ACTATGTTTT TCGATAAAAA 5400

GCTTAATAAA TCAATAATTT CTTCTTTTAT CCCCAACCTG TGGATAAAGT TTGGTAACAT 5460

TGTGGATTAT TTTTCACAGC TTGTGGAAAA TTCTTGCTAT CTATGGTAAA ATATCTCTAG 5520

TATTAAACTT TTAAATAGTA AAGGAGGAGA AAGGATTGAA AGAAAAACAA TTTTGGAATC 5580

GTATATTAGA ATTTGCACAA GAAAGACTGA CTCGATCCAT GTATGATTTC TATGCTATTC 5640

AAGCTGAACT CATCAAGGTA GAGGAAAATG TTGCCACTAT ATTTCTACCT CGCTCTGAAA 5700

TGGAAATGGT COXSGGAAAAA CAACTAAAAG ATATTATTGT AGTAGCTGGT TTTGAAATTT 5760

ATGACGCTGA AATAACTCCC CACTATATTT TCACCAAACC TCAAGATACG ACTAGCTCAC 5820

AAGTTGAAGA AGCTACAAAT TTAACTCTTT ATAACTATAG TCCAAAGTTA GTATCTATTC 5880

CTTATTCAGA TACGGGATTA AAAGAAAAGT ATACCTTTGA TAACTTTATT CAAGGGGATG 5940

GAAATGTTTG GGCTGTATCA GCCGCTTTAG CTGTCTCTGA AGATTTGGCT CTGACCTATA 6000

ACCCTCTTTT TATCTATGGA GGACCAGGCC TTGGTAAGAC TCACTTATTA AACGCTATTG 6060

GAAATGAAAT TCTAAAAAAT ATTCCTAATG CGCGTGTTAA ATATATCCCT GCCGAAAGCT 6120

TTATTAATGA CTTTCTTGAT CACCTAAGAC TTGGGGAAAT GGAAAAGTTT AAAAAGACCT 6180

ATCC3TAGTCT TGATCTTTTG TTAATCGATG ATATCCAGTC ACTCAGCGGA AAAAAAGTCG 6240

CAACTCAGGA AGAATTTTTC AATACCTTTA ACGCCCTTCA TGACAAGCAA AAACAGATTG 6300

TCCTAACGAG TGATCGTAGT CCAAAACATC TAGAAGGGCT CGAGGAGAGG CTTGTCACGC 6360

GTTTTAGTTG GGGATTGACA CAAACTATCA CCCCCCCTGA CTTTGAAACA CGTATTGCCA 6420

TTTTACAAAG TAAGACGGAA CATTTAGGCT ACAATTTCCA AAGTGATACT CTAGAATACC 6480

TAGCTGGGCA ATTTGATTCA AATGTTCGAG ATCTTGAGGG AGCCATCAAC GACATCACTT 6540

TAATTGCCAG AGTAAAAAAA ATCAAGGATA TCACTATTGA TATTGCTGCA GAAGCCATTA 6600

GAGCCCGCAA ACAAGATGTT AGCCAAATGC TCGTCATCCC AATTGATAAA ATCCAAACTG 6660

AAGTTGGTAA CTTTTATGGT GTTAGTATCA AAGAAATGAA GGGAAGTAGA CGCCTTCAAA 6720

ATATTGTTTT GGCCCGTCAA GTAGCCATGT ATTTATCTAG AGAACTAACA GATAATAGTC 6780

TTCCAAAAAT TGGGAAGGAA TTTGGGGGAA AAGATCATAC CACAGTCATT CATGCCCATG 6840

CCAAAATAAA ATCTTTGATT GATCAAGACG ATAATTTACG TTTAGAAATT GAATCAATCA 6900

AAAAGAAAAT CAAATAATTT GTGGATAACT TTTAGTTTTT TATCTTTTTT ATCCACATTT 6960

TTTAAACAAG CTAAAAAACT TGATATGACT TGTTTAAAGG CTGTTTTCCA CAGATTTCAC 7020

AGACTCTATT ATTACTATTA TCTTTCTAAT ACTAAAAATA AATAAAGGAG AATCCATGAT 7080

TCATTTTTCA ATTAATAAAA ATTTATTTCT ACAAGCATTA AATACTACTA AGAGAGCTAT 7140

TAGTTCTAAA AATGCCATTC CTATTTTATC AACAGTAAAA ATTGACGTGA CCAATGAAGG 7200

TATTACTTTA ATTGGTTCAA ATGGTCAAAT TTCAATTGAA AATTTTATTT CTCAAAAAAA 7260

TGAAGATGCT GGTTTGTTAA TTACTTCTTT AGGTTCGATC CTTCTTGAAG CTTCTTTCTT 7320

TATCAATGTA GTATCTAGTT TACCTGATGT AACTCTTGAT TTTAAAGAAA TTGAACAAAA 7380

TCAAATTGTT TTAACCAGTG GCAAATCAGA AATTACCCTA AAAGGAAAAG ATAGCGAACA 7440

ATATCCACGA ATCCAAGAAA TTTCAGCAAG CACTCCTTTA ATACTTGAAA CAAAATTACT 7500

CAAGAAAATT ATTAATGAAA CAGCCTTTGC TGCAAGTACA CAAGAGAGTC GTCCGATTTT 7560

AACAGGTGTC CACTTCGTAT TGAGTCAACA CAAAGAGTTA AAAACAGTTG CAACAGACTC 7620 

TCATCGCCTA AGCCAGAAAA AATTGACTCT TGAAAAAAAT AGTGATGATT TTGATGTCGT 7680 

AATTCCTAGC CGTTCTCTAC GCGAATTTTC AGCGGTATTT ACAGATGATA TCGAAACTGT 7740 

AGAGATTTTC TTTGCCAATA ACCAAATCCT CTTTAGAAGC GAAAATATTA GCTTCTATAC 7800 

TCGTCTCCTA GAAGGAAACT ATCCTGATAC AGATCGCTTG ATTCCAACAG ACTTTAACAC 7860 

TACTATTACT TTTAATGTGG TAAACTTACG CCAGTCAATG GAGCGTGCCC GTCTTTTATC 7920 

AAGTGCGACT CAAAATGGTA CTGTGAAACT TGAAATTAAG GATGGGGTTG TTAGCGCCCA 7980 

TGTTCACTCT CCAGAAGTTG GTAAAGTAAA CGAAGAAATC GATACTGATC AGGTTACTGG 8040 

TGAAGATTTG ACCATTAGTT TCAACCCAAC TTACTPGATT GATTCTCTTA AAGCTTTAAA 8100 

TAGCGAAAAG GTGACTATTA GCTTTATCTC AGCTGTTCGT CCATTTACTC TTGTGCCAGC 8160 

AGATACTGAC GAAGACTTCA TGCAGCTCAT TACACXrAGTT CGTACAAATT AAGTGAAAGA 8220 

GGTTGAGCCT GGCTCGCCTC TTTTATGATA TAATCGAAAA AGAAAAGGAG AGTAGTATGT 8280 

ATCAAGTTGG AAATTTTGTT GAGATGAAAA AATCACACGC TTGTACAATC AAGTCGACTG 8340 

GTAAAAAGGC TAATCGTTGG GAAATTACAC GTGTAGGAGC AGATATCAAA ATAAAATGTA 8400 

GTAATTGTGA GCATGTTGTC ATGATGGGGC GATATGATTT TGAGCGAAAA ATGAATAAAA 8460 

TTATTGACTG AGAACCCTTA GTTAGAGGGT TAGCACTTTA TCCCTTTTTG TGTTATAATA 8520

TTAGGGATTG AAATGAAAAC GGAGAATGAG AAATATGGCT TTGACAGCAG GTATCGTTGG 8580

TTTGCCAAAC GTTGGTAAAT CAACACTATT TAATGCAATT ACAAAAGCAG GAGCAGAGGC 8640 

AGCAAACTAC CCATTTGCGA CGATTGATCC AAATGTTGGA ATGGTGGAAG TTCCAGATGA 8700 

ACGCCTACAA AAACTAACTG AAATGATAAC TCCTAAAAAG ACAGTTCCCA CAACATTTGA 8760 

ATTTACAGAT ATTGCAGGGA TTGTAAAAGG AGCTTCAAAA GGAGAGGGGC TAGGGAATAA 8820 

ATTCTTGGCC AATATTCGTG AAGTAGATGC GATTGTTCAC GTAGTTCGTG CTTTTGATGA 8880 

TGAAAATGTA ATGCGCGAGC AAGGACGTGA AGACGCCTTT GTAGATCCAC TTGCAGATAT 8940 

TGATACCATT AATCTGGAAT TGATTCTTGC TGACTTAGAA TCAGTGAACA AACGATATGC 9000 

GCGTGTAGAA AAGATGGCAC GTACGCAAAA AGATAAAGAA TCAGTAGCAG AATTCAATGT 9060

TCTTCAAAAG ATTAAACCAG TCCTAGAAGA CGGGAAATCA GCTCGTACCA TTGAATTTAC 9120 

AGATGAGGAA CAAAAGGTTG TCAAAGGTCT TTTCCTTTTG AOGACTAAAC CAGTTCTTTA 9180 

TGTAGCTAAT GTGGACGAGG ATGTGGTTTC AGAACCTGAC TCTATCGACT ATGTCAAACA 9240 

AATTCGTGAA TTTGCAGCGA CAGAAAATGC TGAAGTAGTC GTTATTTCTG CGCGTGCTGA 9300

GGAAGAAATT TCTGAATTGA ATGATGAAGA TAAAAAAGAG TTTCTTGAAG CCATTGGTTT 9360

GACAGAATCA GGTGTAGATA AGTTGACGCG TGCAGCTTAC CACTTGCTTG GATTGGGAAC 9420 

TTACTTCACA GCTGGTGAAA AAGAAGTTCG CGCTTGGACT TTCAAACGTG GTATGAAGGC 9480 

TCCTCAAGCA GCTGGTATTA TCCACTCAGA CTTTGAAAAA GGCTTTATTC GTGCAGTAAC 9540 

CATGTCATAT GAAGATCTAG TGAAATACGG ATCTGAAAAG GCCGTAAAAG AAGCTGGACG 9600 

CTTGCGTGAA GAAGGAAAAG AATATATCGT TCAAGATGGC GATATCATGG AATTCCGCTT 9660 

TAATGTCTAA AAATTAATAA ATGGTGTCAA TTAGGTTGGA AAAAAATTCC AACCCTTTTG 9720 

GCTTTTGAAA GGAAAAATAA ATGACCAAAT TACTTGTAGG CTTGGGAAAT CCAGGGGATA 9780 

AATATTTTGA AACAAAACAC AATGTTGGTT TTATGTTGAT TGATCAACTA GCGAAGAAAC 9840

AGAATGTCAC TTTTACACAC GATAAGATAT TTCAAGCTGA CCTAGCATCC TTTTTCCTAA 9900 

ATGGAGAAAA AATTTATCTG GTTAAACCAA CGACCTTTAT GAATGAAAGT GGAAAAGCAG 9960 

TTCATGCTTT ATTAACTTAC TATGGTTTGG ATATTGACGA TTTACTTATC ATTTACGATG 10020 

ATCTTGACAT OGAAGTTGGG AAAATTCGTT TAAGAGCAAA AGGCTCAGCA GGTGGTCATA 10080 

ATGGTATCAA GTCTATTATT CAACATATAG GAACTCAGGT CTTTAACCGT GTTAAGATTG 10140 

GAATTGGAAG ACCTAAAAAT GGTATGTCAG TTGTTCATCA TGTTTTGAGT AAGTTTGACA 10200 

GGGATGATTA TATCGGTATT TTACAGTCTG TTGACAAAGT TGACGATTCT GTAAACTACT 10260 

ATTTACAAGA GAAAAATTTT GAGAAAACAA TGCAGAGGTA TAACGGATAA ATGGTGACCT 10320 

TATTAGATTT ATTCTCAGAA AATGATCAGA TTAAAAAATG GCATCAAAAT TTAACAGATA 10380 

AGAAAAGACA ACTAATACTT GGTTTATCAA CATCTACTAA GGCTCTTGCA ATTGCAAGCA 10440 

GTTTAGAAAA AGAAGATAGG ATTGTGTTAT TGACGTCAAC TTATGGAGAA GCAGAAGGAC 10500

TTGTTAGTGA TCTTATTTCT ATCTTGGGTG AGGAACTCGT CTATCCATTT TTGGTAGATG 10560 

ATGCTCCTAT GGTGGAGTTT TTGATGTCTT CACAGGAAAA AATTATTTCA CGGGTTGAAG 10620 

CCTTGCGTTT TTTGACTGAT TCATCTAAGA AAGGGATTTT AGTTTGTAAT ATCGCAGCAA 10680

GTCGATTSAT TTTACCGTCT CCCAATGCAT TCAAAGATAG TATTGTAAAA ATCTCAGTTG 10740 

GTGAAGAATA TGATCAACAC GCGTTTATCC ATCAGTTAAA GGAAAATGGC TATCGAAAAG 10800 

TTACTCAAGT ACAAACTCAG GGCGAATTTA GTCTTCGAGG AGATATTTTA GATATTTTTG 10860 

AAATATCCCA GTTAGAACCT TGTCGAATTG AGTTTTTTGG TGATGAAATT GATGGTATCA 10920 

GGTCATTTGA AGTAGAAACA CAATTATCGA AAGAAAATAA GACAGAACTC ACTATCTTTC 10980 

CAGCTAGTGA TATGCTTTTG AGAGAAAAGG ATTATCAACG AGGACAGTCA GCTTTAGAAA 11040

AACAAATTTC AAAAACTTTA TCACCTATTT TGAAATCATA CCTAGAAGT^ ATTCTTTCAA 11100

GTTTTCACCA AAAACAAAGT CATGCAGACT CTCGGAAGTT TTTATCTTTG TGCTATGATA 11160 

AGACATGGAC TGTCTTTGAT TATATTGAAA AAGATACTCC AATATTCTTT GATGATTATC 11220

AAAAATTGAT GAATCAGTAT GAAGTCTTTG AAAGAGACTT AGCGCAGTAC TTTACAGAAG 11280 

AATTACAGAA TAGTAAAGCA TTTTCTGATA TGCAGTATTT TTCTGATATT GAACAAATCT 11340 

ATAAAAAACA AAGTCCAGTG ACCTTTTTCT CTAATCTTCA AAAGGGTTTA GGAAATCTCA 11400 

AATTTGACAA AATTTATCAA TTCAATCAAT ATCCTATGCA GGAATTTTTC AATCAGTTTT 11460 

CTTTTCTAAA AGAAGAAATT GAACGATATA AAAAAATGGA TTACACCATT ATTCTGCAGT 11520

CTAGCAATTC AATGGGAAGT AAAACATTGG AGGATATGTT AGAGGAATAT CAGATTAAAT 11580 

TGGATTCTAG AGATAAGACA AATATCTGTA AAGAATCTGT AAACTTAATA GAGGGTAATC 11640 

TCAGACATGG TTTTCATTTT GTAGATGAAA AGATTTTATT GATAACTGAA CATGAGATTT 11700 

TTCAAAAGAA ATTAAAGCGT CGTTTTCGAA GACAACATGT TTCAAATGCA GAGAGATTAA 11760 

AAGATTACAA TGAACTTGAA AAAGGGGACT ATGTTGTCCA TCATATCCAT GGGATTGGTC 11820 

AATATCTAGG AATTGAAACC ATTGAAATCA AGGGAATTCA TCGCGATTAT GTCAGTGTCC 11880 

AATACCAAAA TGGTGATCAA ATTTCTATCC CCGTGGAACA GATTCATCTA CTGTCCAAAT 11940 

ATATTTCAAG TGATGGTAAA GCTCCAAAAC TCAATAAATT AAATGACGGT CATTTTAAAA 12000 

AGGCCAAGCA AAAGGTTAAG AACCAGGTAG AGGATATAGC TGATGATTTA ATCAAACTCT 12060

ACTCTGAACG TAGTCAGTTG AAGGGTTTTG CTTTCTCAGC TGATGATGAT GATCAAGATG 12120 

CCTTTGATGA TGCTTTCCCT TATGTTGAAA CGGATGATCA ACTTCGTAGT ATTGAGGAAA 12180 

TCAAGAGGGA TATGCAGGCT TCTCAGCCAA TGGATCGACT TTTAGTTGGG GATGTTGGTT 12240 

TTGGAAAGAC TGAAGTTGCT ATGCGTGCAG CCTTTAAAGC AGTCAATGAT CACAAACAGG 12300 

TTGTCATTCT AGTTCCGACG ACGGTTTTAG CGCAACAGCA CTATACGAAT TTTAAGGAAC 12360 

GATTCCAAAA TTTTGCAGTT AATATTGATG TGTTGAGTCG CTTTAGAAGT AAAAAAGAGC 12420 

AGACTGCAAC ACTTGAAAAA TTGAAAAACG GTCAAGTCGA TATTTTGATT GGAACACATC 12480 

GTGTTTTGTC AAAAGATGTT GTGTTTGCTG ATTTGGGCTT GATGATTATT GATGAGGAAC 12540 

AGCGATTTGG TGTCAAGCAT AAGGAAACTT TGAAAGAACT GAAGAAACAA GTGGATGTCC 12600 

TAACCTTGAC CGCTACGCCA ATCCCTCGTA CCCTCCATAT GTCTATGCTG GGAATCAGAG 12660 

ATTTATCTGT TATTGAAACT CCGCCGACTA ATCGCTATCC TGTTCAGACC TATGTTTTGG 12720 

AAAAGAATGA TAGTGTCATT CGTGATGCTG TCTTGCGTGA AATGGAGCGT GGAGGTCAAG 12780

TfPTATTATCT TTACAACAAA GTTGACACAA TTGTTCAGAA GGTTTCAGAA TTACAGGAGT 12840

TGATTCCGGA GGCTTCGATT GGATATGTTC ATGGTCGAAT GAGTGAAGTC CAGTTGGAAA 12900

ATACTCTATT AGACTTTATT GAGGGACAAT ACGATATCTT GGTGACGACT ACTATTATTG 12960 

AGACAGGGGT GGACATTCCA AATGCTAATA CTTTATTTAT TGAAAATGCG GACCATATGG 13020 

GCTTGTCAAC CTTATATCAG TTAAGAGGAA GAGTCGGTCG TAGTAATCGT ATTGCTTATG 13080 

CTTATCTCAT GTATCGTCCA GAAAAATCAA TCAGTGAAGT CTCTGAAAAG AGATTAGAAG 13140 

CGATTAAAGG ATTTACAGAA TTGGGCTCTG GCTTTAAGAT TGCAATGCGA GATCTTTCGA 13200 

TTCGTGGAGC AGGAAATCTT TTAGGAAAAT CCCAGTCTGG TTTCATTGAT TCTGTTGGTT 13260 

TTGAATTGTA TTCGCAGTTA TTAGAGGAAG CTATTGCTAA ACGAAACGGT AATGCTAACG 13320 

CTAACACAAG AACCAAAGGG AATGCTGAGT TGATTTTGCA AATTGATGCC TATCTTCCTG 13380 

ATACTTATAT TTCTGATCAA CGACATAAGA TTGAAATTTA CAAGAAAATT CGTCAAATTG 13440 

ACAACCGTGT CAATTATGAA GAGTTACAAG AGGAGTTGAT AGACCGTTTT GGAGAATACC 13500 

CAGATGTAGT AGCCTATCTG TTAGAGATTG GTTTGGTCAA ATCATACTTG GACAAGGTCT 13560 

TTGTTCAACG TGTGGAAAGA AAAGATAATA AAATTACAAT TCAATTTGAA AAAGTCACTC 13620

AACGACTGTT TTTAGCTCAA GATTATTTTA AAGCTTTATC CGTAACGAAC TTAAAAGCAG 13680 

GCATCGCTGA GAATAAGGGA TTAATGGAGC TTGTATTTGA TGTCCAAAAT AAGAAAGATT 13740 

ATGAAATTTT AGAAGGTTTG CTGATTTTTG GAGAAAGTTT ATTAGAGATA AAAGAGTCTA 13800 

AGGAAGAAAA TTCCATTTGA TATTTTTCTT CTATAAAATA GATAAAAATG GTACAATAAT 13860 

AAATTGAGGT AATAAGGATG AGATTAGATA AATATTTAAA AGTATCGCGA ATTATCAAGC 13920 

GTCGTACAGT CGCAAAOGAA GTAOCAGATA AAGGTAGAAT CAAGGTTAAT GGAATCTTGG 13980

CCAAAAGTTC AACGGACTTG AAAGTTAATG ACCAAGTTGA AATTCOCTTT GGCAATAAGT 14040

TGCTGCTTGT AAAAGTACTA GAGATGAAAG ATAGTACAAA AAAAGAAGAT GCAGCAGGAA 14100

TGTATGAAAT TATCAGTGAA ACACGGGTAG AAGAAAATGT CTAAAAATAT TGTACAAT^G 14160

AATAATTCTT TTATTCAAAA TGAATACCAA CGTCGTCGCT ACCTGATGAA AGAACGACAA 14220

AAACGGAATC GTTTTATGGG AGGGGTATTG ATTTTGATTA TGCTATTATT TATCTTGCCA 14280 

ACTTTTAATT TAGCGCAGAG TTATCAGCAA TTACTCCAAA GACGTCAGCA ATTAGCAGAC 14340 

TTGCAAACTC AGTATCAAAC TTTGAGTGAT GAAAAGGATA AGGAGACAGC ATTTGCTACC 14400 

AAGTTGAAAG ATGAAGATTA TGCTGCTAAA TATACACGAG CGAAGTACTA TTATTCTAAG 14460 

TCGAGGGAAA AAGTTTATAC GATTCCTGAC TTGCTTCAAA GGTGATAAAA TGGAAAATTT 14520 

ATTAGACGTA ATAGAGCAAT TTTTGAGTTT GTCAGATGAA AAGCTGGAAG AATTGGCTGA 14580

TAAAAATCAA TTATTGOGTT TACAAGAAGA AAAGGAAAGG AAGAATGCGT AAATTCTTAA 14640

TTATTTTGTT GCTACCAAGT TTTTTGACCA TTTCAAAAGT CGTTAGCACA GAAAAAGAAG 14700 

TCGTCTATAC TTCGAAAGAA ATTTATTACC TTTCACAATC TGACTTTGGT ATTTATTTTA 14760 

GAGAAAAATT AAGTTCTCCC ATGGTTTATG GAGAGGTTCC TGTTTATGCG AATGAAGATT 14820 

TAGTAGTGGA ATCTGGGAAA TTGACTCCCA AAACAAGTTT TCAAATAACC GAGTGGCGCT 14880 

TAAATAAACA AGGAATTCCA GTATTTAAGC TATCAAATCA TCAATTTATA GCTGCGGACA 14940 

AACXSATTTTT ATATGATCAA TCAGAGGTAA CTCCAACAAT AAAAAAAGTA TGGTTAGAAT 15000

CTGACTTTAA ACTGTACAAT AGTCCTTATG ATTTAAAAGA AGTGAAATCA TCCTTATCAG 15060 

CTTATTCGCA AGTATCAATC GACAAGACCA TGTTTGTAGA AGGAAGAGAA TTTCTACATA 15120 

TTGATCAGGC TGGATGGGTA GCTAAAGAAT CAACTTCTGA AGAAGATAAT CGGATGAGTA 15180 

AAGTTCAAGA AATGTTATCT GAAAAATATC AGAAAGATTC TTTCTCTATT TATGTTAAGC 15240 

AACTGACTAC TGGAAAAGAA GCTGGTATCA ATCAAGATGA AAAGATGTAT GCAGCCAGCG 15300 

TTTTGAAACT CTCTTATCTC TATTATACGC AAGAAAAAAT AAATGAGGGT CTTTATCAGT 15360

TAGATACGAC TGTAAAATAC GTATCTGCAG TCAATGATTT TCCAGGTTCT TATAAACCAG 15420 

AGGGAAGTGG TAGTCTTCCT AAAAAAGAAG ATAATAAAGA ATATTCTTTA AAGGATTTAA 15480 

TTACGAAAGT ATCAAAAGAA TCTGATAATG TAGCTCATAA TCTATTGGGA TATTACATTT 15540 

CAAACCAATC TGATGCCACA TTCAAATCCA AGATGTCTGC CATTATGGGA GATGATTGGG 15600 

ATCCAAAAGA AAAATTGATT TCTTCTAAGA TGGCCGGGAA GTTTATGGAA GCTATTTATA 15660 

ATCAAAATGG ATTTGTGCTA GAGTCTTTGA CTAAAACAGA TTTTGATAGT CAGCGAATTG 15720 

CCAAAGGTGT TTCTGTTAAA GTAGCTCATA AAATTGGAGA TGCGGATGAA TTTAAGCATG 15780 

ATACGGGTGT TGTCTATGCA GATTCTCCAT TTATTCTTTC TATTTTCACT AAGAATTCTG 15840 

ATTATGATAC GATTTCTAAG ATAGCCAAGG ATGTTTATGA GGTTCTAAAA TGAGGGAACC 15900 

AGATTTTTTA AATCATTTTC TCAAGAAGGG ATATTTCAAA AAGCATGCTA AGGCGGTTCT 15960 

AGCTCTTTCT GGTGGATTAG ATTCCATGTT TCTATTTAAG GTATTGTCTA CTTATCAAAA 16020 

AGAGTTAGAG ATTGAATTGA TTCTAGCTCA TGTGAATCAT AAGCAGAGAA TTGAATCAGA 16080 

TTGGGAAGAA AAGGAATTAA GGAAGTTGGC TGCTGAAGCA GAGCTTCCTA TTTATATCAG 16140

CAATTTTTCA GGAGAATTTT CAGAAlGCGCG TGCACGAAAT TTTCGTTATG ATTTTTTTCA 16200

AGAGGTCATG AAAAAGACAG GTGCGACAGC TTTAGTCACT GCCCACCATG CTGATGATCA 16260

GGTGGAAACG ATTTTTATGC GCTTGATTCG AGGAACTCGC TTGCGCTATC TATCAGGAAT 16320 

TAAGGAGAAG CAAGTAGTCG GAGAGATAGA AATCATTCGT CCCTTCTTGC ATTTTCAGAA 16380

AAAAGACTTT CCATCAATTT TTCACTTTGA AC5ATACATCA AATCAGGAGA ATCATTATTT 16440

TCGAAATOGT ATTCGAAATT CTTACTTACC AGAATTGGAA AAAGAAAATC CTCGATTTAG 16500

GGATGCAATC TTAGGCATTG GCAATGAAAT TTTAGATTAT GATTTGGCAA TAGCTGAATT 16560

ATCTAACAAT ATTAATGTGG AAGATTTACA GCAGTTATTT TCTTACTCTG AGTCTACACA 16620

AAGAGTTTTA CTTCAAACTT ATCTGAATCG TTTTCCAGAT TTGAATCTTA CAAAAGCTCA 16680

GTTTGCTGAA GTTCAGCAGA TTTTAAAATC TAAAAGCCAG TATCGTCATC CGATTAAAAA 16740

TGGCTATGAA TTGATAAAAG AGTACCAACA GTTTCAGATT TGTAAAATCA GTCCGCAGGC 16800

TGATGAAAAG GAAGATGAAC TTGTGTTACA CTATCAAAAT CAGGTAGCTT ATCAAGGATA 16860

TTTATTTTCT TTTGGACTTC CATTAGAAGG TGAATTAATT CAACAAATAC CTGTTTCACG 16920

TGAAACATCC ATACACATTC GTCATCGAAA AACAGGAGAT GTTTTGATTA AAAATGGGCA 16980

TAGAAAAAAA CTCAGACGTT TATTTATTGA TTTGAAAATC CCTATGGAAA AGAGAAACTC 17040

TGCTCTTATT ATTGAGCAAT TTGGTGAAAT TGTCTCAATT TTGGGAATTG CGACCAATAA 17100

TTTGAGTAAA AAAACGAAAA ATGATATAAT GAACACTGTA CTTTATATAG AAAAAATAGA 17160

TAGGTAAAAA ATGTTAGAAA ACGATATTAA AAAAGTCCTC GTTTCACACG ATGAAATTAC 17220

AGAAGCAGCT AAAAAACTAG GTGCTCAATT AACTAAAGAC TATGCAGGAA AAAATCCAAT 17280

CTTAGTTGGG ATTTTAAAAG GATCTATTCC TTTTATGGCT GAATTGGTCA AACATATTGA 17340

TACACATATT GAAATGGACT TCATGATGGT TTCTAGCTAC CATGGTGGAA CAGCAAGTAG 17400

TGGTGTTATC AATATTAAAC AAGATGTGAC TCAAGATATC AAAGQAAGAC ATGTTCTATT 17460

TGTAGAAGAT ATCATTGATA CAGGTCAAAC TTTGAAGAAT TTGCGAGATA TGTTTAAAGA 17520

AAGAGAAGCA GCTTCTGTTA AAATTGCAAC CTTGTTGGAT AAACCAGAAG GACGTGTTGT 17580

AGAAATTGAG GCAGACTATA CTTGCTTTAC TATCCCAAAT GAGTTTGTAG TAGGTTATGG 17640

TTTAGACTAC AAAGAAAATT ATCGTAATCT TCCTTATATT GGAGTATTGA AAGAGGAAGT 17700

GTATTCAAAT TAGAAAGAAT AATCTTTAAT GAAAAAACAA AATAATGGTT TAATTAAAAA 17760

TCCTTTTCTA TGGTTATTAT TTATCTTTTT CCTTGTGACA GGATTCCAGT ATTTCTATTC 17820

TGGGAATAAC TCAGGAGGAA GTCAGCAAAT CAACTATACT GAGTTGGTAC AAGAAATTAC 17880

CGATGGTAAT GTAAAAGAAT TAACTTACCA ACCAAATGGT AGTGTTATCG AAGTTTCTGG 17940

TGTCTATAAA AATCCTAAAA CAAGTAAAGA AGAAACAGGT ATTCAGTTTT TCACGCCATC 18000

TGTTACTAAG GTAGAGAAAT TTACCAGCAC TATTCTTCCT GCAGATACTA CCGTATCAGA 18060

ATTGCAAAAA CTTGCTACTG ACCATAAAGC AGAAGTAACT GTTAAGCATG AAAGTTCAAG 18120

TGGTATATGG ATTAATCTAC TCGTATCCAT TGTGCCATTT GGAATTCTAT TCTTCTTCCT 18180

ATTCTCTATG ATGGGAAATA TGGGAGGAGG CAATGGCCGT AATCCAATGA GTTTTGGACG 18240 

TAGTAAGGCT AAAGCAGCAA ATAAAGAAGA TATTAAAGTA AGATTTTCAG ATGTTGCTGG 18300 

AGCTGAGGAA GAAAAACAAG AACTAGTTGA AGTTGTTGAG TTCTTAAAAG ATCCAAAACG 18360 

ATTCACAAAA CTTGGAGCCC GTATTCCAGC AGGTGTTCTT TTGGAGGGAC CTCCGGGGAC 18420 

AGGTAAAACT TTGCTTGCTA AGGCAGTCGC TGGAGAAGCA GGTGTTCCAT TCTTTAGTAT 18480 

CTCAGGTTCT GACTTTGTAG AAATGTTTGT CGGAGTTGGA GCTAGTCGTG TTCGCTCTCT 18540 

TTTTGAGGAT GCCAAAAAAG CAGCACCAGC TATCATCTTT ATCGATGAAA TTGATGCTGT 18600 

TGGACGTCAA CGTGGAGTCX; GTCTCGGCGG AGGTAATGAC GAACGTGAAC AAACCTTGAA 18660 

CCAACTTTTG ATTGAGATGG ATGGTTTTGA GGGAAATGAA GGGATTATCG TCATCGCTGC 18720 

GACAAACCGT TCAGATGTAC TTGACCCTGC CCTTTTGCGT CX;AGGACGTT TTGATAGAAA 18780 

AGTATTGGTT GGTCGTCCTG ATGTTAAAGG TCGTGAAGCA ATCTTGAAAG TTCACGCTAA 18840 

GAATAAGCCT TTAGCAGAAG ATGTTGATTT GAAATTAGTG GCTCAACAAA CTCCAGGCTT 18900 

TGTTGGTGCT GATTTAGAGA ATGTCTTGAA TGAAGCAGCT TTAGTTGCTG CTCGTCGCAA 18960 

TAAATCGATA ATTGATGCTT CAGATATTGA TGAAGCAGAA GATAGAGTTA TTGCTGGACC 19020 

TTCTAAGAAA GATAAGACAG TTTCACAAAA AGAACGAGAA TTGGTTGCTT ACCATGAGGC 19080 

AGGACATACC ATTGTTGGTC TAGTCTTGTC GAATGCTCGC GTTGTCCATA AGGTTACAAT 19140 

TGTACCACGC GGCCGTGCAG GCGGATACAT GATTGCACTT CCTAAAGAGG ATCAAATGCT 19200 

TCTATCTAAA GAAGATATGA AAGAGCAATT GGCTGGCTTA ATGGGTGGAC GTGTAGCTGA 19260 

AGAAATTATC TTTAATGTCC AAACCACAGG AGCTTCAAAC GACTTTGAAC AAGCGACACA 19320

AATGGCACGT GCAATGGTTA CAGAGTACGG TATGAGTGAA AAACTTGGCC CAGTACAATA 19380

TGAAGGAAAC CATGCTATGC TTGGTGCACA GAGTCCTCAA AAATCAATTT CAGAACAAAC 19440 

AGCTTATGAA ATTGATGAAG AGGTTCGTTC ATTATTAAAT GAGGCACGAA ATAAAGCTGC 19500 

TGAAATTATT CAGTCAAATC GTGAAACTCA CAAGTTAATT GCAGAAGCAT TATTGAAATA 19560

CGAAACATTG GATAGTACAC AAATTAAAGC TCTTTACGAA ACAGGAAAGA TGCCTGAAGC 19620 

AGTAGTIAGAG GAATCTCATG CACTATCCTA TGATGAAGTA AAGTCAAAAA TGAATGACGA 19680 

AAAATAACCC TGAGAGAGGC TGGAGCCTCT CTTTTTTGTG CAGTTTAGGA GCTAAAGGGA 19740

ACAGAATGGA GAAAATGGAA CAAATGTGTT TTCTAATCTG TTAGACTGTA TCTAGAAAGG 19800 

GGAAAATTAT GATTAAAGAA TTGTATGAAG AAGTCCAAGG GACTGTGTAT AAGTGTAGAA 19860 

ATGAATATTA CCTTCATTTA TGGGAATTGT CGGATTGGGA GCAAGAAGGC ATGCTCTGCT 19920

TACATGAATT GATTAGTAGA GAAGAAGGAC TGGTAGACGA TATTCCACGT TTAAGGAAAT 19980

ATTTCAAGAC CAAGTTTCGA AATCGAATTT TAGACTATAT CCGTAAACAG GAAAGTCAGA 20040

AGCGTAGATA CGATAAAGAA CCCTATGAAG AAGTGGGTGA GATCAGTCAT CGTATAAGTG 20100

AGGGGGGTCT CTGGCTAGAT GATTATTATC TCTTTCATGA AACACTAAGA GATTATAGAA 20160

ACAAACAAAG TAAAGAGAAA CAAGAAGAAC TAGAACGCGT CTTAAGCAAT GAACGATTTC 20220

GAGGGCGTCA AAGAGTATTA AGAGACTTAC GCATTGTGTT TAAGGAGTTT ACTATCCGTA 20280

CCCACTAGTA AGTCATGCAA AAAAAATGAA AAAAATTAGA AAAAGTAGTT GACAAAGTTT 20340

GAAAAGGCTG TATAATAGTA AGAGTTGAAA ATAACAACTC AGGTCCGTTG GTCAAGGGGT 20400

TAAGACACCG CCTTTTCACG GCGGTAACAC GGGTTCGAAT CCCGTACGGA CTATGGTATG 20460

TTGCGTCAGG ACCACTTGAT GAAAAAAAGT TTAAAAAAAC TTAAAAATCT TCAAAAAAGT 20520

GTTGACAAGC GAAAGCAGTT GTGATATACT AATATAGTTG TCGCTTGAGA GAAGCAAGTG 20580

ACAAAGACCT TTGAAAACTG AACAAGACGA ACCAATGTGC AGGGCGCTAC AACGTAAGTT 20640

GTAGTACTGA ACAATGAAAA AAACAATAAA TCTGTCAGTG ACAGAAATGA GTAAGAACTC 20700

AAACTTTTTA ATGAGAGTTT GATCCTGGCT CAGGACGAAC GCTGGCGGCG TGCCTAATAC 20760

ATGCAAGTAG AACGCTGAAG GAGGAGCTTG CTTCTCTGGA TGAGTTGCGA ACGC3GTV5AGT 20820

AACGCGTAGG TAACCTGCCT GGTAGCGGGG GATAACTATT GGAAACGATA

CATAAGAGTA GATGTTGCAT GACATTTGCT TAAAAGGTGC ACTTGCATCA CTACCAGATG 20940

GACCTGCGTT GTATTAGCTA GTTGGTGGGG TAACGGCTCA CCAAGGCGAC GATACATAGC 21000

CGACCTGAGA GGGTGATCGG CCACACTGGG ACTGAGACAC GGCCCAGACT CCTACGGGAG 21060

GCAGCAGTAG GGAATCTTCG GCAATGGACG GAAGTCTGAC CGAGCAACGC CGCGTGAGTG 21120

AAGAAGGTTT TCGGATCGTA AAGCTCTGTT GTAAGAGAAG AACGAGTGTG AGAGTGGAAA 21180

GTTCACACTG TGACGGTATC TTACCAGAAA GGGACGGCTA ACTACGTGCC AGCAGCCGCG 21240

GTAATACGTA GGTCCCGAGC GTTGTCCGGA TTTATTGGGC GTAAAGCGAG CGCAGGCGGT 21300

TAGATAAGTC TGAAGTTAAA GGCTGTGGCT TAACCATA 21338


(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6273 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

TGTTTTTAAA GAGCCGTGTC TGGATAGACT TTCGGACGCA ACGCTCTATT AGATAATGAA 60 

CTGCCTATAC ACAAGATTTC TAACCTTAGT CGACATGAGC TGAAACCTCT TATTTGTTAA 120 

GTAGTTCACA AAATATTATA CACCTATTTT ATGAATAGTC AACTGTCTTT ACAGTAAAAT 180 

TTTAGAAAAT CATGAAAATT TTCTCTTTCT TTCCATTTTA AGTGACATTC AGTCATTCTC 240 

ACATCAAAAA AGCCCAGACG AAATTGTCTG AGCATTCTTT TATCTAGTCG TTTAAGGAAG 300

TTGAGTTCAG TATGTTTAAA GTCTCTGTCC CATCATTTCT TCAACAAACC TTGTTCTTGG 360

AGAAACTCCT TQGCTACTTG CTTTGCTGAC TTGCCTTCAA CACCGACTTG GTAGTTGAGC 420 

TGGCTCATCT GGCTTTCTGT AATCTTACCA GCCAATGTAT TAAGAACTCT TTCCAACTCT 480 

GGGTGTTTCT TGAGAAGAGC TTCTTTCATG AGTGGAGCCC CTTGATAAGG TGGGAAGAGT 540 

TGCTTGTCAT CTTCCAAGAC CTGTAAATCA TAACGCTCCA ATTCCGCATC AGTCGAATAG 600 

GCATCCGTGA TTTGAATATC CCCTGACTGA ATAGCCTGAT AGCGAAGGGC TGGCTCAATG 660 

GTCGCTACAT TGAGATTGAG ACCATACATT GATTGCAAGC CCTTATTTCC ATCTTCACGG 720 

TCGTTAAACT CGAGTGTAAA ACCTGCCTTC AACTGCCCTT CCACTTTTTT CAAGTCTGAA 780 

ATGGTCTTCA AGCCATATTC TTGAGCAATC TTTTTCGGAA CAGCTACAGC ATAGGTGTTT 840 

TGATAAGACA TGGGTTTGAG ATAGGCTAGA TGATCCTGCT TAGCAATGCC ATCACGCGCC 900 

ACCTGATAAA CCTGTTCTGG TTCATGACTC ACCTTGGGTG ATGGTTGAAG CAAACTTTCA 960 

GTCACCGTAC CAGTAAATTC AGGATAGATG TCAATATCGC CTTTTTTCAG AGCTTCATAA 1020 

AGGAAGCTTG TCTTCCCAAA ATTCGGTTTA ACAGTCGCAG TCATGCTGGT ATTTTCTTCA 1080 

ATCAGCAACT TATACATATT GGCCAAAATT TCTGGTTCTG GACCTATTTT CCCAGCAATA 1140 

ACCAAGTTTT CCTTCTCTTT TTGAACCAAA AGAGCTGGAC TATAAGACAG ACCCAGTAAT 1200 

AAAGCCACCA AGGCAAAACC TGAGAAAATC GTCCGTAATT TTGCTTTTTC CATCACTTTT 1260 

AGTAGGAAGT TAAAGGCAAT GGCTAGCACT GCAGAAGAAA GTGCCCCAAT CAAAATCAAA 1320 

CTGGCATTAT TACGGTCAAT TCCCAAAAGA ATAAAGGAAC CTAGTCCCCC TGCACCAATC 1380 

AAGGCCGCCA AGGTTGCCGT ACCGATAATC AAAACAGCTG CCGTCCGAAT CCCAGACATG 1440 

ATAACAGGCA TGGCGAGTGG AATTTCAAAT TTCTTGAGAC GTTCCCATCT GGTCATCCCA 1500 

AAGGCAATCC CA6CCTCTTG CAGGTTCX3GA TCAATTCCCT TCAGCCCAGT GATAGTATTT 1560 

TGCAAAATAG 6GAAAATCGC ATAAATCACT AGAGCTGTCA AAGCCGGCAA GGTCCCAATT 1620 

CCCATCAAAG GGATAAAGAG CCCXTAACAAG GCCAGAGACG GGATGGTCTG GAAAATACCT 1680 

GCAATCTGCA AGACXTCAGTC GGCCAGCTTC TCATGATAGC GAAGAAAAAC AGCCAAGGGA 1740 
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ATCGCAAGCA 


AAATAGCTAG TAACAAGGTC AAAAGCGACA ACTGCAAATG TTGAGATAGA 1800

GCTGTCAACC AATCACTAAA ACGATCCTGA AAAGTT6CAA TTAAATTAGT CATGAACACT 1860

ACCTCCAAAC AAGTCTGCTA CAAAGTCTGT TGCAGGCGCT TTTAAAATTG TCTCGGGATT 1920

CGCTACCTGG CGAATTTCTC CATCCTGCAA GACAGCAATA CGGTCCGCCA ACTTCAAGGC 1980

TTCATCCGTA TCATGGGTTA CAAAAATCGT TGTCATCCCA AACTCTTTAT GCAATTCTTT 2040

TGTCAGAACC TGCAACTGTT TTCTCGAAAT AGCATCCAAG GCCGAAAAGG GTTCATCCAT 2100

GAGGAAAATC TTGGGCTGAC CAATCATAGC TCGGACAATA CCGACCCGTT GCTGTTCTCC 2160

ACCAGATAAT TCACTAGGTA AGCGATGCCC ATACTCGGCT ACTGGTAAAC CAACCTTAGC 2220

CAAAAGCTCT TCTGTTTTCT TCGTAATTTC TTCCTTGCTC CACCCCTTCA TTTCAGGAAT 2280

GAGAGCAATA TTTTCCGCAA CTGTTAGATT TGGAAAAAGA GCAATAGCCT GTAAAACATA 2340

ACCAGTAGAA AGACGAAGTT CyiCGCTCATC ATAGTCTTTG ATGCGCTTCC CATCCATATA 2400

AATATTTCCA TCAGTTGGTT CCAAAAGACG GTTAATCATC TTGAGCATGG TCGTCTTACC 2460

TGACCCA6AA GGCCCTACTA AAACCATAAA TTCCCCATCC TCAATCTGTA AGTTGACATC 2520

TCTCAAGACA TCCTTTTCTG TGTAGCGCAG TGCTACATTT TTGTATTCAA TCATTCTTTG 2580

TCCTCAATTT AAAACTTCCC TCGATTGGTC AAGTCTTCTA CCTTAGGCAT AACTTCCTTA 2640

TTATCCCAAT GCTCCACAAT TTTCCCGTTC TCTAAACGGA AGATATCGTA CTGGGCATAA 2700

GCAACGCCAT CAATCTGAGT CTGACCATAG CTAACCACAT AGTTTCCTTG TCCTAAGAGT 2760

TGGAAAACAA AGTCAAAAGT GACACTATAT TCAGCCACAT AQTTTTTATA AGCAGCACTT 2820

CCTTGTCCAA TATCATGATT ATGCTGAATC AAATCGTCTG CCACATAATC ACTCCACTGC 2880

TCTAGCTCCC CATTTTGGAA AATTTCTGTC AAGAAACGGC GAACCAGCTT TTTATTTTCT 2940

GCTTTCTTAT CCAAATCCTT GATTTCAAAA TCTCCAAAAA TTTGATCTAG TTGGTCATTT 3000

TCAGGTGTTC GATAGTAGTC AATGACATCC CAATGCTCAA CAATACAACC ATTCTCATCC 3060

TCACGGAAAG TATCCGTCGT CACCCATTGA GCTTCTCCAC CATTCAGATA TTGATGAACA 3120

TGAACAAAGA CCAGATTGCC ATCCTCAATG GTGCGGACAA TCTTAATCTG ACGCTCTGGA 3180

TGACGCTCAA AGAAATCTGC AAAGAAGGCT GCAAATCCTT CTTTCCCGTC AGGAACACCT 3240

GTCGAATGTT GGATATAGGT ATCCCCTACA GACTGGGCTT GAGCCTCAGC AACTCGTCCG 3300

TCTTGAATGG CATGGATGTA TAGGTTGTGA GCATTTTTCA CTTGTTGTGA CATATTCTAA 3360

ACCTCATTTC CCTTCTCTTT CAGATTCGCC AAAATTCTTT CTTGAAAACC TTCAAATTGG 3420

T6AATTTCTT CCTCTGAAAA TCCTTTGTAA AAGATAGTAT CCAATTTCTG ACTGACACGA 3480




TGCCCCACTT CTTTCTGGGA CTTGCCTAAC TCCGITAAAA CTAAATACTT CTTACGCTTG 3540 

TCTTTTCCAC ACGGACTAAC AATTACAAGC TTTTGTTCCT CTAGCTTTTT TATCATAGTC 3600 

GTCAGCGTAT TATTCGCAAG TCCAGTCGCA AGCGCGATAT CTGTCGCAGT TGCGCAGCCA 3660 

GTTTCACTAT TCCATAAAAC CGCTAAAATC TTGCCCTGTT CACCCCTATA AAGAGCCTCA 3720 

GGATCTTGAC TCAGTAACTT TTGAAAAATC CGCCCATTCA ACAAACGAAT ATGATGGGCT 3780 

AGCAAATGAC CATCTTTCAT AACACCTCCA ATTTATTTCG ATATCGAAAT GAATAAAACA 3840 

ATTGTAACAC TCATCGTTCT AACTGTCAAC TATTTCGATT TAGAAATAAT TTTTGATAAT 3900 

TATCCACACC ACCATACTCC GGCTCAACTA ACTTTTAACG AGAGTTTCTA AACTCCTTCG 3960 

TCCTCCAGTC TACAAAAGCC TTCCATTCGT ACTATCCTAT ATTTTATGAG GGGACACATT 4020 

TTTCCTATCA GACCATTTAT TTTAAAGATA GAAGTAAATC ATAATTGCTT CCATCTGTTC 4080 

TTTTATAGTA TATTGAAGTT AGACTAGAGC ACTGTATCTT CTAAAACATT GATAGAAAGC 4140 

GATTTGAATT TCCCAATCAA TTTGTTCGTA TTTATAGCAT TTCGAAACTG GAATAGGACA 4200 

CCATGACTGC TAAAAGATTT CTATAAATTC ATTTAATTTC CTCAATCAAT TTGTTCATAT 4260 

CTTATTTCAT TCCGCTATAA TTTCACCTTA CCCTATCTTT TTCGTAGCAC CCTTCAAACA 4320 

GCCTATCCCC TACCGTTTGA CGATTCCTCA CTTCGCTCCA CTTCCATTAC AGAAGTTTCT 4380 

TCACTACTAT GGGCTCGGCT GACTTCTCAT GATTCCTTGT TACTACTATT TGAACGCTCA 4440 

CGAGATAGAT CTTACAAAAA ATGCTTTGAT CCACAATGGA ATCAAAGCAT TTTAAAGAGT 4500 

TCCTCATACA TAAGCGCAGA AGTCGCAGTT CCTCTGTACT TGGCTTCTTC TCTTTTGACA 4560 

AAGCGAGCCA AGTTGAGCAA CTCAGGTGCT GGATGTTTGG GATTTAGGAG CAATTCACGA 4620 

TTGACCAGGC CTGAGAGACG AACTGCXTTGC AATTGCTCAT TTGTAGTAGG CAGTTTTTTA 4680 

GTAGTCTCTA GGAGAGCAGC AACTAAATCT TCACTCAAAT CATGTCGAGC ATGATTGTAA 4740 

AGATCTTTTA TAAGGCTTTC TAGGTTTGGT TCTACCATCC CTACCACCTC CCTTATGGTT 4800 

TAATAATGTT TAATCAAATC AACCGTTGAA CGATCCAATT TCTTCACCAA GGCTTGTAAG 4860 

AAAGCTTGCG CTTCTAGGAA GTCATCCATT GCATAGAGGG TTTGGTGAGA ATGGATATAA 4920 

CGAGCGCAGA CACCGATAGT TGTTGATGGG ACACCACCAT TTTTCAGATG AGCTGCACCT 4980 

CCATCTGTTC CGCCTTTACC ACAGTAGTAT TGGTACTTGA TACCAGCTTC TTCAGCCGTT 5040 

GTCAAAAGGA AATCCTTCAT CCCTGGGAGA AGCAAGTGAC CTGGATCATA GAAACGAATC 5100

AAGGTTCCAT CTCCAATCTT GCCTTGACCA CCGTAGACAT CACCTGCTGG TGAGCAATCA 5160 

ACTGCGAG6A AGACTTCTGG GTCAAACTTG GTTGTAGAGG TATGAGCGCC ACGCAGACCA 5220 

ACTTCTTCTT 6GACGTTAGA ACCCAGATAG AGTTCATTGC CGAGTTTTTG ACCCGATAAA 5280
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GCTTCAGCTA GCTCGCTTAC CATGAGGACA CCGTAGCGGT TATCCCAAGC TTTTGAGATG 5340 

ATATTTTTTT CATTGGCTGT CAAAATTGCA GAACTATCTG GTACAATGGT ATCACCAGGA 5400 

CGGATGCCAA AACTTTCTGC CTCAGCCTTG TCCGCAAAAC CACCATCAAA AACGATATCG 5460 

GCAATGGCTG GCATGGTTGG TCCCCCCTTT CCACGAGTCA AATGCGGAGG AACAGAACCT 5520 

GAAATCACAG GAATTTCATG ACCATCACGA GTCAAGAGTT TGAAACGTTG GCTGCTAACC 5580 

ACCATGGGGT TCCAGCCACC GATTTCTACG ACACGGAAGG TACCATCTGG CTTGATTTCG 5640 

CTGACCATAA AACCAACTTC GTCCATATGA GAAGCGACCA AGACGCGCGG TGCATCCACA 5700 

GCTTCTGAAT GTTTGATACC AAAAATACCA CCCAAGCCAT CTGTCACCAC TTCATCCACA 5760 

TGCGGTGTCA ACTTTTCACG AAGATAAGCA CGGACAGGCX; CTTCATGACC TGAGACTGCA 5820 

GCAAGTTCTG TTACTTCTTT AATTTTTGAA AATAATGTTG TCATTTCAGT TCCTTCTTTC 5880 

TTTCATCCAT TTTACCACTT TTTATAGGAG AAGGATAGTG GGAAGGTGGA TTTCTAAGTT 5940 

AGTATCTTAG TCCTGCTCTA TCTTAGAAAA GGATAGTATT CTCTTGCATG TAGTGCAAAA 6000 

TCTAGTAAAC ATTCCAAAAT TAACTCGAAT ATTTATTTCC AAACAAAAAA ACAATACACC 6060 

ATCAAAGTTG TTTGGATTTT TCATGAAATT TACAGAAAAT AGTTGACTTC CCTTTCTTCT 6120 

TTCTTTAAAT ATATAGTTGG TTGAGTTTGG AATAGTACGC TGTAGCTGCT AAAACATTTC 6180 

TAGAAATTAA TTTGACTTTC CTAATAGAGT TGTTCATATC TTATTTCAAT TTACTATAGT 6240 

ACAAAACTAG AAAAGGAAAA AATCATGACC AGG 6273 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ACAACCTTTT TCAAAAACTC ACCTTGGTAC GGAGATGTTT TGCTTTCTGC TATTATTTTC 60 

GGTTATATTC ATATCAATTT TGCTTTAACT CCTCTTGCTT TTTTCATTTA TGCTAGTGGA 120 

GGTCTTATTT TAGCTCTATT GTATCGCATG ACTAAAAATC TCTACTATCC AATACTAGTT 180 

CATATTCTCA TTAATATCAC TGCCTTCTGG GATGTGTGGT TGCTCCTATT TTCAGGAAGT 240 

TAGCTTACTA AAATAATGTC GGAACTTTCC GGCATTTTCT TTTTTCACAA ATAGTCAACG 300 

TTTTTCTTTT CGATATTGTA GTGGTGTGTA TCCAGTTATT TTTTTGAATT GATTTTGAAA 360 
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ATAAGGTTGA 


CTTGAGAAAG GCAGATAGTG AAGATAGTTA AGAAGAATAG GATGTTCTTT 420

TTTCCTTTTT GGAAAACTTC TAAAATATGG TATAATGAAA AGATAAAGAA GTTGGGGGTA 480

GAAGATGAAC ATTCAACAAT TACGCTATGT TGTGGCTATT GCCAATAGTG GTACTTTTCG 540

TGAAGCTGCT GAAAAGATGT ATGTTAGTCA GCCGAGTCTG TCTATTTCTG TTCGTGATTT 600

GGAAAAAGAG TTGGGCTTTA AGATTTTCCG TCGGACCAGC TCAGGGACTT TCTTGACCCG 660

TCGTGGGATG GAATTTTATG AAAAATCGCA AGAATTGGTT AAAGGATTTG ATATTTTTCA 720

AAATCAGTAT GCCAATCCTG AAGAAGAAAA AGATGAATTT TCTGTTGCTA GCCAGCACTA 780

TGACTTCTTO CCACCAACTA TTACGGCCTT TTCAGAGCGC TATCCTGACT ATAAGAACTT 840

CCGTATTTTT GAATCAACTA CTGTTCAAAT ATTAGATGAA GTGGCGCAAG G6CATA6TGA 900

GATTGGGATT ATCTACCTCA ACAATCAAAA TAAAAAGGGG ATTATGCAAC GGGTTGAAAA 960

ATTAGGTCTG GAGGTCATCG AATTGATTCC TTTCCATACC CATATTTATC TCCGTGAGGG 1020

TCATCCTTTA GCCCAGAAAG AGGAATTAGT CATGGAGGAT TTAGCGGATT TACCAACGGT 1080

TCGTTTCACT CAAGAGAAAG ACGAGTACCT TTATTATTCA 6AG7VACTTTG TCGATACCAG 1140

CGCTAGCTCA CAGATGTTTA ATGTGACAGA CCGTGCCACC TTGAATGGTA TTTTGGAGCG 1200

GACGGACGCC TATGCGACAG GTTCTGGATT TTTAGATAGT GACAGTGTTA ATGGCATTAC 1260

AGTTATTCGT CTCAAGGATA ACCTAGATAA CCGCATGGTC TATGTTAAAC GTGA^^GAAGT 1320

GGAGCTTAGT CAAGCTGGGA CTCTCTTCGT AGAAGTCATG CAAGAATATT TTGATCAAAA 1380

GAGGAAATCA TGAAAAAAAG AGCAATAGTG GCAGTCATTG TACTGCTTTT GATTGG6CTG 1440

GATCAGTTGG TCAAATCCTA TATCGTCCAG CAGATTCCAC TGGGTGAAGT GCGCTCCTGG 1500

ATCCCCAATT TCGTTAGCTT GACCTACCTG CAAAATCGAG GTGCAGCCTT TTCTATCTTA 1560

CAAGATCAGC AGCTGTTATT CGCTGTCATT ACTCTGGTTG TCGTGATAGG TGCCATTTGG 1620

TATTTACATA AACACATGGA GGACTCATTC TGGATGGTCT TGGGTTTGAC TCTAATAATC 1680

GCGGGTGGTC TTGGAAACTT TATTGACAGG GTCAGTCAGG GCTTTGTTGT GGATATGTTC 1740

CACCTTGACT TTATCAACTT TGCAATTTTC AATGTGGCAG ATAGCTATCT GACGGTTGGA 1800

GTGATTATTT TATTGATTGC AATGCTAAAA GAGGAAATAA ATGGAAATTA AAATTGAAAC 1860

CGTTTGGATA AGGCTTTGTC AGATTTGTCA GAATTATCAC GTAGTCTCGC 1920

GAATGAACAA ATTAAATCAG GCCAGGTCTT GGTCAATGGT CAAGTCAAGA AAGCTAAATA 1980

CACAGTCCAA GAGGGTGATG TCGTCACTTA CCATGTGCCA GAACCAGAGG TATTAGAGTA 2040

TGTGGCTGAG GATCTTCCGC TAGAAATAGT CTACCAA6AT GAGGATGTGG CTGTCGTTAA 2100

CAAACCTCAG GQAATGGTTG TGCACCCGAG TGCTGGTCAT ACCAGT6GAA CCCTAGTAAA 2160
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TGCCCTCATG TATCATATTA AGGACTTGTC GGGTATCAAT GGGGTTCTGC GTCCAGGGAT 2220 

TGTTCACCGT ATTGATAAGG ATACGTCAGG TCTTCTCAT6 ATTGCTAAAA ACGATGATGC 2280 

GCATCTAGCA CTTGCCCAAG AACTCAAGGA TAAAAAGTCT CTCCGCAAAT ATTGGGCGAT 2340 

TGTTCATGGA AATCTACCTA ATGATCGTGG T6TAATTGAA GCGCCGATTG GCCGGAGTGA 2400 

AAAAGACCGT AAGAAACAGG CTGTAACTGC TAAAGGGAAG CCTGCAGTGA CGCGTTTTCA 2460 

CGTCTTGGAA CGCTTTGGCG ATTATAGCTT AGTAGAGTTG CAACTGGAGA CAGGGCGCAC 2520 

TCATCAAATC CGTGTCCACA TGGCTTATAT CGGCCATCCA GTCGCTGGTG ATGAGGTCTA 2580 

TGGTCCTCGC AAGACTTTGA AAGGACATGG ACAATTTCTT CATGCCAAGA CTTTAGGTTT 2640 

TACTCATCCG AGAACAGGTA AGACCTTGGA ATTTAAAGCA GATATCCCAG AGATTTTTAA 2700 

GGAAACCTTG 6AGAGATTGA GAAAGTAAGA ATGAAAAAGA AATTAACTAG TTTAGCACTT 2760 

6TAGGCGCTT TTTTAGGTTT GTCATGGTAT GGGAATGTTC AGGCTCAAGA AAGTTCAGGA 2820 

AATAAAATCC ACTTTATCAA TGTTCAAGAA GGTGGCAGTG ATGCGATTAT TCTTGAAAGC 2880 

AATGGACATT TTGCCATGGT GGATACAGGA GAAGATTATG ATTTCCCAGA TGGAAGTGAT 2940 

TCTCGCTATC CATGGAGAGA AGGAATTGAA ACGTCTTATA AGCATGTTCT AACAGACCGT 3000 

GTCTTTCGTC GTTTGAAGGA ATTGGGTGTC CAAAAACTTG ATTTTATTTT GGTGACCCAT 3060 

ACCCACAGTG ATCATATTGG AAATGTTGAT GAATTACTGT CTACCTATCC AGTTGACCGA 3120 

GTCTATCTTA AGAAATATAG TGATAGTCGT ATTACTAATT CTGAACGTCT ATGGGATAAT 3180 

CTGTATGGCT ATGATAAGGT TTTACAGACT GCTGCAGAAA AAGGTGTTTC AGTTATTCAA 3240 

AATATCACAC AAGGGGATGC TCATTTTCAG TTTGGGGACA TGGATATTCA GCTCTATAAT 3300 

TATGAAAATG AAACTGATTC ATCGGGTGAA TTAAAGAAAA TTTGGGATGA CAATTCCAAT 3360 

TCCTTGATTA GCGTGGTGAA AGTCAAT6GC AA6AAAATTT ACCTTGGGGG CGATTTAGAT 3420 

AATGTTCATG GAGCAGAAGA CAAGTATGGT CCTCTCATTG GAAAAGTTGA TTTGATGAAG 3480 

TTTAATCATC ACCATGATAC CAACAAATCA AATACCAAGG ATTTCATTAA AAATTTGAGT 3540 

CCGAGTTTGA TTGTTCAAAC TTCGGATAGT CTACCTTGGA AAAATGGTGT TGATAGTGAG 3600 

TATGTTAATT GGCTCAAAGA ACGAGGAATT GAGAGAATCA ACGCAGCCAG CAAAGACTAT 3660 

GATGCAACAG TTTTTGATAT TCGAAAAGAC GGTTTTGTCA ATATTTCAAC ATCCTACAAG 3720 

CCGATTCCAA GTTTTCAAGC TGGTTGGCAT AAGAGTGCAT ATGGGAACTG GTGGTATCAA 3780 

GCGCCTGATT CTACAGGAGA GTATGCT6TC GGTTGGAATG AAATCGAAGG TGAATGGTAT 3840 

TACTTTAACC AAACGGGTAT CTTGTTACAG AATCAATGGA AAAAATGGAA CAATCATTGG 3900 
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TTCTATTTGA CAGACTCTGG TGCTTCTGCT AAAAATTGGA AGAAAATCGC TGGAATCTGG 3960 

TATTATTTTA ACAAAGAAAA CCAGATGGAA ATTGGTTGGA TTCAAGATAA AGAGCAGTGG 4020 

TATTATTTGG ATGTTGATGG TTCTATGAAG ACAGGATGGC TTCAATATAT GGGGCAATGG 4080 

TATTACTTTG CTCCATCAGG GGAAATGAAA ATGGGCTGGG TAAAAGATAA AGAAACCTGG 4140 

TACTATATGG ATTCTACTGG TGTCATGAAG ACAGGTGAGA TAGAAGTTGC TGGTCAACAT 4200 

TATTATCTGG AAGATTCAGG AGCTATGAAG CAAGGCTGGC ATAAAAAGGC AAATGATTGG 4260

TATTTCTACA AGACAGACGG TTCACGAGCT GTGGGTTGGA TCAAGGACAA GGATAAATGG 4320 

TACTTCTTGA AAGAAAATGG TCAATTACTT GT6AACGGTA AGACACCAGA ACGTTATACT 4380 

GTGGATTCAA GTGGTGCCTG GTTAGTGGAT GTTTCGATCG AGAAATCTGC TACAATTAAA 4440 

ACTACAAGTC ATTCA6AAAT AAAAGAATCC AAAGAAGTAG TGAAAAAGGA TCTTGAAAAT 4500 

AAAGAAACGA GTCAACATGA AAGTGTTACA AATTTTTCAA CTAGTCAAGA TTTGACATCC 4560 

TCAACTTCAC AAAGCTCTGA AACGAGTGTA AACAAATCGG AATCAGAACA GTAGTAGAAA 4620 

AGAAGGTTTT AGGGCCTTCT TTTTCCTATC AACTCTTTTC TATTTCCTGT TATTCATGTT 4680

ATAATGGATA AATATGAATA ATCGGAGTGA GACTATGAAA TACAAACGGA TTGTCTTTAA 4740 

GGTGGGTACT TCTTCTCTGA CAAATGAGGA TGGAAGTTTA TCACGTAGTA AGGTAAAGGA 4800 

TATTACCCAG CAGTTGGCTA TGCTGCACGA GGCTGGTCAT GAGTTGATTT TGGTGTCTTC 4860 

AGGTGCCATT GCGGCTGGTT TTGGAGCCTT AGGATTTAAA AAGCX3TCCGA CTAAGATTGC 4920 

TGATAAACAG GCTTCAGCAG CGGTAGGGCA GGGGCTTTTG TTGGAAGAAT ATACAACCAA 4980 

TCTTCTCTTG CGTCAAATCG TTTCTGCACA AATCTTGCTG ACCCAAGATG ACTTTGTGGA 5040 
TAAGCGTCGT TATAAAAATG CCCATCAGGC TTTGTCGGTT TTGCTCAACC GTGGGGCAAT 5100

TCCTATCATC AATGAGAATG ATAGTGTCGT TATTGATGAG CTCAAGGTTG GGGACAATGA 5160 

CACTCTAAGT GCTCAAGTAG CGGCGATGGT CCAAGCAGAC CTTTTAGTTT TCTTGACAGA 5220 

TGTGGACGGT CTCTATACTG GAAATCCTAA TTCAGATCCA AGAGCCAAAC GCTTGGAGAG 5280 

AATCGAGACC ATCAATCGTG AGATTATTGA TATGGCTGGT GGAGCTGGTT CGTCAAACGG 5340 

AACTGGGGGT ATGTTAACCA AAATCAAGGC TGCAACTATC GCGACGGAAT CAGGAGTTCC 5400 

TGTTTATATC TGCTCATCCT TGAAATCAGA TTCCATGATT GAGGCGGCAG AGGAGACCGA 5460 

GGATGGTTCT TACTTTGTTG CTCAAGAGAA GGGGCTTCGT ACCCA6AAAC AATGGCTTGC 5520 

CTTCTATGCT CAGAGTCAAG GTTCTATTTG GGTTGATAAA GGGGCTGCGG AAGCTCTCTC 5580 

TCAATATGGA AAGAGTCTTC TCTTATCTGG TATCGTTGAA GCAGAAGGAG TCTTTTCTTA 5640 

CGGTGATATC GTGACAGTAT TTGACAAGGA AA6TGGAAAA TCACTTGGAA AAGGACGCGT 5700 
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GCAATTTGGA GCATCTGCTT TGGAGGATAT GTTGCGTTCT CAAAAAGCCA AGGGTGTCTT 5760 

GATTTACCGT GACGACTGGA TTTCCATTAC TCCTGAAATC CAACTACTTT TTACAGAATT 5820 

TTAGAGGTAA ACTATGGTGA GTAGACAAGA ACAATTTGAA CAGGTACAGG CTGTTAAAAA 5880 

ATCGATTAAC ACAGCTAGTG AAGAAGTGAA AAACCAAGCC TTGCTAGCCA TGGCTGATCA 5940 

CTTAGTGGCT GCTACTGAGG AAATTTTAGC GGCTAATGCC CTCGATATGG CAGCGGCTAA 6000 

GGGGAAAATC TCAGATGTGA TGTTGGATCG TCTTTATTTG GATGCAGATC GTATAGAAGC 6060 

GATGGCAAGA GGAATTCGTG AAGTGGTTGC CTTACCAGAT CCAATCGGTG AAGTTTTAGA 6120 

AACAAGTCAG CTTGAAAATG GTTTGGTTAT CACAAAAAAA CGTGTAGCTA TGGGTGTCAT 6180 

CGGTATTATC TATGAAAQCC GTCCAAATGT GACGTCTGAT GCGGCTGCTT TGACTCTTAA 6240 

GAGTGGAAAT GCGGTTGTTC TTCGTAGTGG TAAGGATGCC TATCAAACAA CCCATGCCAT 6300 

TGTCACAGCC TTGAAGAAGG GCTTGGAGAC GACTACTATT CATCCAAATG TGATTCAACT 6360 

GGTGGAGGAT ACTAGCCGTG AAAGTAGTTA TGCTATGATG AAGGCCAAGG GCTATCTAGA 6420 

CCTTCTCATT CCTCGTGGAG GAGCTGGCTT GATCAATGCA GTGGTTGAGA ATGCGATTGT 6480 

ACCTGTTATC GAGACAGGGA CTGGGATTGT CCATGTCTAT GTGGATAAGG ATGCAGACGA 6540 

AGACAAGGCG CTGTCTATCA TCAACAATGC TAAAACCAGT CGTCCTTCTG TTTGTAATGC 6600 

CATGGAGGTT CTGCTGGTTC ATGAAAACAA GGCAGCAAGC TTCCTTCCTC GCTTGGAGCA 6660 

AGTGTTGGTT GCAGAGCGTA AGGAAGCTGG ACTGGAACCA ATTCAATTCC GCCTAGATAG 6720 

CAAAGCAAGC CAGTTTGTTT CAGGTCAAGC AGCTGAGACC CAAGACTTTG ACACCGAGTT 6780 

TTTAGACTAT GTCCTTGCTG TTAAGGTTGT GAGCAGTTTA GAAGAAGCGG TTGCGCACAT 6840 

TGAATCCCAC AGCACCCATC ATTCGGATGC TATTGTGACG GAAAATGCTG AAGCTGCAGC 6900 

ATACTTTACA GATCAAGTGG ACTCTGCAGC GGTGTATGTT AATGCCTCAA CTCGTTTCAC 6960 

AGATGGAGGA CAATTTGGTC TTGGTTGTGA AATGGGGATT TCTACTCAGA AATTGCACGC 7020 

GCGTGGTCCC ATGGGCTT6A AAGAGTTGAC CAGCTACAAG TATGTGGTTG CCGGTGATGG 7080 

GCAGATAAGG GAGTAAGAGA TGAAGATTGG ATTTATCGGT TTGGGGAATA TGGGTGCTAG 7140 

CTTGGCAAAA TCTGTCTTGC AGACTAGGAC GTCAGATGAG ATTCTCCTTG CCAATCGTAG 7200 

TCAAGCTAAG GTAGATGCTT TCATTGCAGA CTTTGGTGGT CAGGCTTCCA GCAATGAAGA 7260 

AATGTTTGCA GAAGCAGATG TGATTTTTCT AGGAGTTAAG CCTGCTCAGT TTTCTGAACT 7320 

GCTTTCTCAA TACCAGACCA TCCTTGAAAA AAGAGAAAGT CTTCTTTTGA TTTCGATGGC 7380 

AGCTGGATTG ACCTTAGAAA AACTAGCAAG TCTTATCCCA AGTCAACACC GAATTATTCG 7440 
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TATGATGCCT 


AATACCCCTG CTTCTATCGG GCAAGGAGTG ATTAGTTATG CCTTGTCTCC 7500

TAATTGCAGG GCTGAGGACA GTGAGCTCTT TTATCAGCTT TTAGCCAAGG CTGGTCTCTT 7560

GGTTGAACTA GGAGAAAGTT TAATCGATGC AGCGACAGGT CTTGCAGGTT GTGGACCAGC 7620

CTTTGTCTAT CTTTTTATCG AGGCCTTGGC AGATGCAGGT GTTCAGACAG GATTACCACG 7680

AGAAATAGCA TTGAAAATGG CAGCACAAAC TGTGGTAGGA GCTGGGCAAT TGGTCCTTGA 7740

AAGTCAGCAA CATCCTGGAG TATTGAAAGA CCAAGTCTGT AGCCCAGGCG GTTCGACTAT 7800

CGCTGGTGTA GCAAGCCTT^ AAGCGCATGC TTTCCGAGGA ACAGTCATGG ATGCAGTTCA 7860

TCAAGCCTAC AAACGAACAC AAGAACTAGG TAAATAAGAG GTAGTTTTGA CTGCCTCTTT 7920

TATGGTGGCT GAAATGAGAA GACACAAAAA GATTGTCACA AACCCCTATT TTTTTGATAG 7980

AATAGAAGTA GTAAAAAAGA AATGAGTTAG ACATGTCAAA AGGATTTTTA GTCTCTCTTG 8040

AGGGACCAGA GGGAGCAGGC AAGACCAGTG TTTTAGAGGC TCTGCTACCA ATTTTAGAGG 8100

AAAAAGGAGT AGAGGTGTTG ACGACCCGTG AACCTGGCGG AGTCTTGATT GGGGAGAAGA 8160

TTCGGGAAGT GATTTTGGAT CCAAGTCATA CTCAGATGGA TGCTAAAACA GAGCTACTTC 8220

TCTATATTGC CAGTCGCAGA CAGCATTTGG TGGAAAAAGT TCTTCCAGCC CTTGAAGCTG 8280

GCAAGTTGGT CATCATGGAT CGTTTTATCG ATAGTTCTGT TGCCTATCAG GGATTTGGTC 8340

GTG6CTTAGA TATTGAAGCC ATTGACTGGC TCAATCAGTT TGCGACAGAT GGCCTCAAAC 8400

CCGATTTGAC ACTCTATTTT GACATCGAGG TGGAAGAAGG GCTGGCTCGT ATTGCTGCTA 8460

ATAGTGACCG CGAGGTTAAT CGTTTGGATT TGGAAGGGTT GGACTTGCAT AAAAAAGTTC 8520

GTCAAGGCTA CCTTTCTCTT CTGGATAAAG AGGGAAATCG CATTGTCAAG ATTGATGCTA 8580

GTCrCCCTPT GGAGCAAGTT GTGGAAACTA CCAAGGCTGT CTTGTTTGAC GGAATGGGCT 8640

TGGCCAAATG AAACAAGATC AACTAAAGGC TTGGCAACCA GCTCAGTTTG ACCGTTTTGT 8700

CCGTATCTTA GAACAAGACC AGCTCAATCA CGCCTATCTC TTTTCAGGTT TCTTTGAAAG 8760

CTTGGAAATG GCX3CAATTTT TAGCTAAGAG CCTCTTTTGT ACGGATAAAG TTGGCGTCTT 8820

ACCATGTGAG AAATGCCGAA GTTGCAAGCT GATTGAACAG GGAGAATTTC CCGATGTCAC 8880

CTTGATTAAA CCAGTTAATC AGGTCATTAA GACGGAACGC ATTCGAGAAT TGGTGGGTCA 8940

GTTTTCTCAA 6CAGGGATTG AAAGCCAGCA ACAGGTCTTT ATCATCGA6C AA6CGGATAA 9000

AATGCATCCC AACGCAGCCA ATTCTCTGCT CAAGGTCATC GAAGAACCCC AGAGTGAAGT 9060

TTATATTTTC TTCTTGACTA GCGATGAGGA AAAGATGTTA CCGACAATCC GAAGTCGGAC 9120

TCAGATCTTC CACTTTAAAA AGCAAGAAGA AAAACTTATC TTACTCTTAG AACAAATGGG 9180

ACTTGTTAAG AAAAAAGCGA CTCTTTTAGC TAAGTTTAGT CAATCGCGAG CTGAAGCAGA 9240
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AAAGTTGGCT AATCAGGCAA GTTTTTGGAC CTTGGTCGAT GAAAGTGAAC GCCTGCTGAC 9300 

TTGGTTAGTA GCTAAGAAAA AAGAAAGTTA TCTACAGGTT GCCAAATTAG CCAACTTGGC 9360 

AGATGATAAG GAAAAACAG6 ATCAGGTTTT ACGGATTCTT GAAGTTCTCT GTGGGCAGGA 9420 

CCTCTTGCAG GTAA6AGTAA GAGTGATTCT ACAAGATTTA CTAGAAGCTA GAAAAATGTG 9480 

GCAAGCTAAT GTCAGCTTTC AAAATGCCAT GGAATATCTG GTCTTGAAAG AAATATAAAC 9540 

TCAAAAATGA ATGATAAAGA AAGGAAAGGG CTGTTTTATG GACAAAAAAG AATTATTTGA 9600 

CGCGCTGGAT GATTTTTCCC AACAATTATT GGTAACCTTA GCCGATGTGG AAGCCATCAA 9660 

GAAAAATCTC AAGAGCCTGG TAGAGGAAAA TACAGCTCTT CGCTTGGAAA ATAGTAAGTT 9720 

GCGAGAACGC TTGGGTGAGG TGGAAGCAGA T6CTCCTGTC AAGGCCAAGC ATGTTCXSTGA 9780 

AAGTGTCCGT CGCATTTACC GTGA7GGATT TCACGTATGT AATGATTTTT ATGGACAACG 9840 

TCGAGAGCAG GACGAGGAAT GTATGTTTTG TGACGAGTTG CTATACAGG6 AGTAGGCATG 9900 

CAGATTCAAA AAAGTTTTAA GGGGCAGTCT CCCTATGGCA AGCTGTATCT AGTGGCAACG 9960 

CCGATTGGCA ATCTAGATGA TATGACTTTT CGTGCTATCC AGACCTTGAA AGAAGTGGAC 10020 

TGGATTGCTG CTGAGGATAC GCGCAATACA GGGCTTTTGC TCAAGCATTT TGACATTTCC 10080 

ACCAAGCAGA TCAGTTTTCA TGAGCACAAT GCCAAGGAAA AAATTCCTGA TTTGATTGGT 10140 

TTCTTGAAAG CAGGGCAAAG TATTGCTCAG GTCTCTGATG CCGGTTTGCC TAGCATTTCA 10200 

GACCCTGGTC ATGATTTAGT TAAGGCAGCT ATTGAGGAAG AAATTGCAGT TGTGACAGTT 10260 

CCAGGTGCCT CTGCAGGAAT TTCTGCCTTG ATTGCCAGTO GTTTAGCGCC ACAGCCACAT 10320 

ATCTTTTACG GTTTTTTACC 6A6AAAATCA GGTCAGCAGA AGCAATTTTT TGGCTTGAAA 10380 

AAAGATTATC CTGAAACACA GATTTTTTAT GAATCACCTC ATCGTGTAGC AGACACX3TTG 10440 

GAAAATATGT TAGAAGTCTA CGGTGACCGC TCCGTTGTCT TGGTCAGGGA ATTGACCAAA 10500

ATCTATGAAG AATACCAAC6 AGGTACTATC TCTGAGTTAT TAGAAAGCAT TGCTGAAACG 10560 

CCACTCAAGG GCGAATGTCT TCTCATTGTT GAGGGTGCCA GTCAGGGTGT GGAGGAAAAG 10620 

GACGAGGAAG ACTTGTTCGT AGAAATTCAA ACCCGCATCC AGCAAGGTGT GAAGAAAAAC 10680 

CAAGCTATCA AGGAAGTCGC TAAGATTTAC CAGTGGAATA AAAGTCAGCT CTACGCTGCC 10740 

TACCACGACT GGGAAGAAAA ACAATAAAGG GAGACAGGAT GTAATAATTC TGTCTGTTTC 10800 

TGTTTAACTT AATTAGTGAT GATAATATAA AGATGTATCA CTTGGTATAG AAGCTTTGGT 10860 

ATTAAGTTTT TTATTAAGCC CATACGGAAT ACCGATGGTT GGAGCAGCAG TTATAGCGTT 10920 

CTTAGAAGGT ATAAATAGAA AAATAAGGTC ATTTTAAATC AAAGGATTGA TAAATCAGAA 10980 
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AGAAGGTGAT TTTTTGCGAA CATACGAAAA TAAAGAAGAA CTAAAAGCTG AGATAGAGAA 11040 

AACATTTGAG AAATATATTT TAGAATTTGA TAATATTCCA GAAAATTTAA AAGATAAGAG 11100 

AGCTGATGAA GTTGACAGAA CTCCAGCAGA AAACCTTGCT TATCAGGTTG GTTGGACCAA 11160 

CTTGGTTCTT AAATGGGAAG AA6ATGAAAG AAAGGGGCTT CAAGTAAAAA CACCATCGGA 11220 

TAAATTTAAA TGGAATCAAC TTGGTGAATT ATATCAGTGG TTCACAGATA CCTACGCTCA 11280 

TTTATCTCTG CAAGAGTTGA AAGCAAAATT AAATGAAAAT ATTAATTCTA TCTCTGCAAT 11340 

GATTGATTCG TTGAGTGAGG AAGAATTATT TGAACCGCAT ATGAGAAAGT GGGCTGATGA 11400 

AGCGACTAAA ACAGCGACTT GGGAAGTGTA TAAGTTTATT CATGTAAATA CGGTTGCACC 11460 

TTTTGGAACT TTCAGAACTA AAATCAGAAA ATGGAAGAAG ATAGTATTAT AAATTATATT 11520 

TTTAACTTTA AAAAATTTCA TAAAAATGGT TACCAAAGGC GATAGAAGAA AAACTATCGT 11580 

CTTTTTCTTT GCAAATTTTT AAGAAGGGAG GTGATCTTGC ATGGACTTTG AATATTTTTA 11640 

TAACAGAGAA GCGGAAAGAT TTAACTTCTT AAAAGTACCG GAGATATTAG TTGATAGAGA 11700 

AGAATTTCGG GGCTTATCAG CAGAAGCAAT TATCCTTTAT TCCATACTTC TTAAACAGAC 11760 

AGGAATGTCA TTTAAGAATA ACTGGATAGA CAAGGAAGGC AGAGTATTTA TCTATTTTAC 11820 

TGTCGAAGAA ATTATGAAAA GAAGAAATAT CTCAAAGCCA ACTGCCATAA AAACATTAGA 11880 

TGAGCTTGAT GTAAAAAAGG AATAGGACTG ATCGAAAGAG TAAGGCTTGG ACTTGGTAAG 11940 

CCGAACATCA TTTATGTTAA AGACTTTATG AGTATATTTC AGGTAAAAGA AAATGACTTA 12000 

CAGAAGTCAA AAAACTTAAC TTCAGAAGTA AAAGATTTTA ACCTCAGAAG TAAAGAAAAT 12060 

GAACTTCAAG AGGTTAAGAA CCTTGACTCT AACTATATAG AGAATAATAA GAGTAAGTAT 12120 

AGTAAGAGAG AATATAGTTT TGGTGAAAAC GGACTTGGAA CATTTCAAAA TGTGTTTTTA 12180 

GCTGCTGAAG ATATATCGGA TTTACAAATC ATAATGAACT CACAGCTTGA GAATTACATT 12240 

AGACTTCCTG CAAAACTAGA ATCCTAGTTC ATGATTGATA ATGCCAGCAA TCAAATTCAT 12300 

TCGTAATCCG AAGCGTTTAC GATGATTTCG ATAGATTGTT GAAAACATTT TAAACGTTTT 12360 

TACTTTGGCA AAGATGTTCT CAATCTTGCT TCTCTCCTTG GATAGCGCAT GGTTACAGGC 12420 

TTTATCTTCA GCTGTTAGCG GCTTGAGTTT GCTGGATTTA CGT6GAGTTT GTACTTGAGG 12480 

ATATATCTTC ATGAGCCCTT GATAACCACT GTCA6ACAAG ATTTTACCAG CTTGTCCGAT 12540 

ATTTCTGCGA CTCATTTTGA ACAACTTCAT ATCACGACAA TAGTTCACAG CGATATCCAA 12600 

AGAAACAATT CTCCCTTGAC TTGTGACAAT CGCTTGAGCC TTCATAGCGT GAAATTTCTT 12660 

TTTACCAGAA TGATTCGCTA ATTCTTTTTT TAGGGCGATT GATTTTTACT TCCGTCGCAT 12720 

CAATCATTAC CGTGTCCTCA GAACTGAGAG GAGTTCTTGA AATCQTAACA CCACTTTGAA 12780 
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CAAGAGTTAC TTCAACCCAT TGGCTCCGAC GGATTAAGTT GCTTTCGTGA ATACCAAAAT 12840 

CAGCCGCAAT TTGTTCATAA GTTCGATATT CTCGCACATA TTGAAGAGTG GCCATAAGAA 12900 

GGTCTTCTAG GCTTAATTTA GGTTTTCGTC CACCTTTTGC GTGTTTAAGT TGATAAGCTG 12960 

TTTTTAATAC AGCTAATATC TCTTCAAAAG TCGTGCGCTG AACACCAACA AGACGCTTAA 13020 

ATCGTGCATC AGTTAGTTGT TTACTTGCTT CATCATTCAT AGAACTACTA TACCATATTT 13080 

TGTTTCGCAG GAAGTCTATT GGAAAGTAAG AAATATTGAA GCTGAGGCTA TTAGAAGAAA 13140 

TTGTGAGCGT GGTGCTATTT TTTCAGGTAA AATAAAATAT CACGAAGATT CACAGTTTAA 13200 

AGGAGATCAC TATGTTGAAT GTTATGCTGT TTTAGATAAT ACGGTTATAG CAAGAGATAG 13260 

AATAACAGTC CCTATCGATC CGTTATGTGG AAAAGATTTT ATAGAGTAGC ATATAATTGA 13320 

TTCTTAACTG GAATACTCAC TATCTCTTTA CATCAAGAAA ATGACTAAAC AGGGAAGTTT 13380 

GCCTTCTTCC CTTTTTTTGT TATACTAGTA GAAGAAAAAA TTAGAAAGAT TTGTGGGTGT 13440 

CAAACAGCCX: AGTGGGGTGT TTTAATATGG ACTTAGGTCC CACCCAAAGA GGTATTAGTG 13500 

TCGTGTCTCA ATCTTATATC AATGTTATCX3 GTGCTGGTTT GGCAGGTTCT GAAGCAGCTT 13560 

ACCAAATCGC AGAGCGTGGT ATTCCAGTTA AACTATATGA AATGCGTGGT GTCAAGTCTA 13620 

CACCCCAGCA TAAAACAGAC AATTTTGCTG AGTTGGTTTG TTCCAATTCT TTGCGTGGGG 13680 

ATGCTTTGAC AAATGCAGTT GGTCTTCTCA AGGAAGAAAT GCX5TCGCTTG GGTTCTGTTA 13740 

TCTTGGAATC TGCTGAGGCT ACACGTGTTC CTGCAGGTGG TGCCCTTGCA GTGGACCGTG 13800 

ATGGTTTCTC TCAAATGGTG ACCGAAAAAG TTGCCAACCA CCCCTTGATT GAAGTGGTTC 13860 

GTGATGAAAT TACAGAATTG CCGACAGATG TTATTACGGT TATCGCTACT GGTCCTTTGA 13920 

CAAGTGATGC CTTGGCTGAA AAGATTCATG CTCTTAATGA CGGTGCTGGT TTTTATTTCT 13980 

ACGATGCGGC AGCGCCTATT ATCGATGTCA ACACTATCGA TATGAGCAAG GTCTACCTCA 14040 

AATCACGTTA TGATAAGGGA GAAGCGGCCT ACCTCAATGC CCCTATGACC AAGCAAGA^T 14100 

TTATGGATTT CCATGAAGCT TTGGTCAATG CAGAAGAAGC ACCGCTTAGT TCTTTTGAAA 14160 

AAGAAAAGTA CTTTGAAGGA TGTATGCCTA TCGAAGTCAT GGCCAAACGT GGCATTAAAA 14220 

CTATGCTTTA TGGCCCTATG AAGCCAGTCG GTCTTGAGTA CCCAGACGAC TATACAGGAC 14280 

CTCGTGATGG AGAATTTAAA ACACCTTATG CGGTTGTGCA ACTTCGTCAG GATAATGCAG 14340 

CTGGTAGCCT CTACAATATT GTTGGTTTCC AGACCCACCT CAAATGGGGA GAACAAAAGC 14400 

GTGTCTTOCA AATGATTCCG GGTCTTGAAA ATGCGGAGTT TGTCCGTTAT GGTGTGATGC 14460 

ATCGCAATTC TTACATGGAT TCACCAAATC TTCTTGAGCA GACTTACCGT TCTAAGAAAC 14520 
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AACCAAATCT CTTCTTTGCT GGTCAAATGA CGGGTGTGGA AGGCTATGTT 6AGTCGGCGG 14580 

CTTCAGGCTT AGTTGCGGGA ATTAACGCAG CTCGTCTCTT CAAGGAAGAA AGCGAQGCTA 14640 

TTTTCCCCGA GACGACAGCG ATTGGAAGCT TAGCTCATTA CATTACCCAT GCCGACAGCA 14700 

AACATTTCCA ACCAATGAAT GTCAATTTTG GGATCATCAA GGAGTTGGAA GGCGAGCGTA 14760 

TCCGTGATAA GAAGGCTCGT TATGAAAAAA TTGCAGAGCG TGCCCTTGCC GACTTAGAGG 14820 

AATTTTTGAC TGTCTAATTT TTTTGAAAGA ATTGCTCATG ATACTATAAA AATCTTAGAA 14880 

ATTGTGATAA AATAGGTAGG ATGAAAGAAG GAGAGTGAAA ATGGCGAATC CCAA6TATAA 14940 

ACGTATTTTA ATCAAGTTAT CAGGTGAAGC CCTTGCCGGT GAACGTGGCG TAGGGATTGA 15000 

TATCCAAACA GTTCAAACAA TC6CAAAAGA GATTCAAGAA GTTCATAGCT TAGGTATCGA 15060 

AATTGCCCTT GTTATCGGTG GAGGAAATCT CTGGCGTGGA GAACCTGCAG CAGAAGCAGG 15120 

TATGGACCGT GTTCAGGCAG ATTACACAGG AATGCTTGGG ACTGTTATGA ATGCTCTTGT 15180 

GATGGCAGAT TCATTGCAAC AAGTTGGGGT TGATACGCGT GTACAAACAG CTATTGCCAT 15240 

GCAACAAGTG GCAGAGCCTT ATGTCCGTGG ACGTGCCCTT CGTCACCTTG AAAAAGGCCG 15300 

TATCGTTATC TTTGGTGCTG GAATTGGTTC ACCTTACTTC TCGACAGATA CAACAGCGGC 15360

CCTTCGTGCA GCTGAAATCG AAGCAGATGC CATCCTCATG GCTAAAAATG GTGTCGATGG 15420 

TGTTTACAAT GCCGATCCTA AGAAAGATAA GACAGCTGTT AAGTTTGAAG AATTGACCCA 15480 

CCGTGACGTT ATCAATAAAG GTCTTCGTAT CATGGACTCA ACA6CTTCAA CCCTCTCAAT 15540 

GGACAACGAC ATTGACTTGG TTGTATTCAA CATGAACCAA CCAGGCAACA TCAAACGTCT 15600 

CGTATTTGGT GAAAATATCG GAACAACAGT TTCAAATAAT ATCGAAGAAA AGGAATAAGA 15660 

AAGAATATGG CTAACXXIAAT TATTGAAAAA GCTAAAGAGA GAATGACCCA GTCTCACCAA 15720 

TCACTTGCTC GTGAATTTGG TGGTATCCGT GCTGGTCGTG CCAATGCAAG CTTGCTTGAC 15780 

CGTGTACATG TAGAATACTA TGGAGTCGAA ACTCCTCTTA ACCAAATCGC TTCAATTACG 15840 

ATTCCAGAAG CGCGTGTTTT GTTGGTAACA CCATTTGACA AGTCTTCATT GAAAGACATC 15900 

GAACGTGCCT TGAACGCTTC TGATATTGGT ATCACACCGG CTAATGACGG TTCTGTGATT 15960 

CGCTTGGTTA TCCCAGCTCT TACAGAAGAA ACTCGTCGTG ACCTTGCTAA AGAAGTGAAG 16020 

AAGGTCGGCG AAAATGCTAA AGTGGCTGTC CGCAATATCC GTCGCGATGC TATGGACGAA 16080 

GCTAAGAAAC GAGAAAAAGC AAAAGAAATC ACTGAAGACG AATTGAAGAC TCTTGAAAAA 16140 

GACATTCAAA AAGTAACAGA CGATGCTGTT AAACACATCG ACGACATGAC TGCTAACAAA 16200 

GAGAAAGAAC TTTTGGAAGT CTAAAAATAA ACAGAAAAAC TCAGTTGGCA TTGCTGGCTG 16260 

AGTTTTATTC GAAAGAAGGA AATATGAATA CAAATCTTGC AAGTTTTATC GTTGGACTGA 16320 
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TCATCGATGA 


AAAC6ACCGT TTTTACTTTG TGCAAAAGGA TGGTCAAACC TATGCTCTTG 16380

CTAAGGAAGA AGGCCAACAT ACAGTAGGGG ATACGGTCAA AGGTTTTGCA TACACGGATA 16440

TGAAGCAAAA ACTCCGCCTG ACAACCTTAG AAGTGACTGC CACTCAGGAC CAATTTGGTT 16500

GGGGACGTGT CACAGAGGTT CGTAAGGACT TGGGTGTCTT TGTGGATACA GGCCTTCCTG 16560

ACAAGGAAAT CGTTGTGTCA CTCGATATTC TCCCTGAGCT CAAGGAACTC TGGCCTAAGA 16620

AGGGCGACCA ACTCTACATC CGTCTTGAAG TGGATAAGAA AGACCGTATC TGGGGCCTCT 16680

TGGCTTATCA AGAAGACTTC CAACGTCTTG CTCGTCCTGC CTACAACAAC ATGCAGAACC 16740

AAAACTGGCC AGCCATTGTT TACCGTCTCA AGCTGTCAGG AACTTTTGTT TACCTACCAG 16800

AAAATAATAT GCTTGGTTTT ATTCATCCTA GCGAGCGTTA CGCAGAGCCA CGTTTGGGGC 16860

AAGTATTAGA TGCGCGCGTT ATTGGTTTCC GTGAAGTGGA CCGCACTCTG AACCTCTCCC 16920

TCAAACCACG CTCCTTTGAA ATGTTGGAAA ACGATGCTCA GATGATTTTG ACTTATTTGG 16980

AAAGCAATGG CGGTTTCATG ACCTTAAATG ACAAGTCATC TCCAGACGAC ATCAAGGCAA 17040

CCTTTGGCAT TTCTAAAGGT CAGTTCAAGA AAGCTTTAGG TGGTCTTATG AAGGCTGGTA 17100

AAATCAAGCA GGACCAGTTT GGGACAGAGT TGATTTAGGG AGGCTTATGA GAAAATCATT 17160

TTACACTTGG CTCATGACCG AGCGCAATCC TAAAAGTAAC AGTCCCAAAG CAATTTTGGC 17220

AGACCTCGCT TTTGAAGAGT CAGCCTTTCC AAAACACACA GATGATTTTG ATGAGGTCAG 17280

TCGCTTTTTG GAGGAGCATG CCAGTTTCTC TTTTAACCTA GGAGATTTTG ACAGCATTTG 17340

GCACGAATAT CTAGAACACT AGCATTTATT CATTGGGTTT GGGCTAGTAA TTTCTCCATC 17400

CCTCTGCTAT AATAAAAAGA AATAAAAGGA TTAGAGAGGT TCTTTATTTG AAGGAACATT 17460

CAATAGACAT TCAACTGAGT CATCCAGATG ACCTGTTTCA TCTTTTTGGT TCCAATGAAC 17520

GCCATCTTCX3 TTTGATOGAA GAAGAGCTTG ATGTTGTGAT TCATGCTCGT ACGGAGATTG 17580

TCCAGGTTTT GGGAGAAGAG TCTGCCTGTG AGGAAGCCCG TCAAGTTATT CAGGCTTTGA 17640

TGGTCTTGGT AAATCGTGGG ATGACCGTTG GTACGCCAGA TGTAGTCACT GCGATTAGCA 17700

TGGTCAAAAA TGATGAAATT GACAAGTTTG TCGCCCTTTA CGAAGAAGAA ATTATCAAGG 17760

ATAATACTGG GAAACCTATC CGTGTCAAAA CCCTAGGGCA AAAGCTTTAT GTGGACAGTG 17820

TCAAACAGCA TGATGTGACC TTTGGAATTG GGCCAGCAGG TACAGGGAAG ACCTTCCTTG 17880

CAGTGACCTT GGCAGTGACT GCCCTTAAAC GTGGGCAAGT CAAGCGAATT ATCCTAACTC 17940

GTCCAGCGGT GGAAGCGGGA GAGAGTCTT6 GATTTCTTCC GGGTGATCTT AAGGAGAAGG 18000

TGGATCCTTA CCTTCGTCCT GTTTACXSATG CCTTGTATCA AATTCTTGGG AAAGACCAAA 18060
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CGACTCGTCT 


CATGGAGCGT GAAATTATCG 


AAATTGCGCC 


CCTTGCCTAT 


ATGCX3TGGCC 


18120 


GGACCTTGGA 


TGATGCCTTT 


GTCATTCTCX5 


ATGAGGCGCA AAACACGACC 


ATCATGCAGA 


18180 


TGAAGATGTT 


CTTGACGCGT 


TTAGGTTTTC 


ATTCTAAGAT GATTGTCAAT 


GGAGATATTA 


18240 


GTCAGATTGA 


CCTGCCACGT 


AATGTCAAGT 


CCGGTTTGAT 


TGATGCTCAA 


GAGAAACTCA 


18300 


AGAACATCCA 


TCAGATTGAC 


TTTGTTCATT 


TTTCAGCCAA 


GGATGTGGTT 


CGCCATCCTG 


18360 


TTGTCGCTCA 


GATTATCCGA 


GCCTATGAAT 


ATTCTACTGA 


AGTTGCACAC 


GACTGATTTT 


18420 


GAGGAAGTTC 


GCCTGCAAAA 


GAATAGACTT 


GTTCGGTAAC 


TGTAAAAAGT 


GTTATACTAT 


18480 


TTTTATGGAA ACAGTATACG 


ACAAAGCACA 


AAAACTTAAC 


TCAAAAAACT 


TCAAACTATT 


18540 


GATTGGTGTC 


AAAAAGGAAA 


CCTTTCAACT 


CATGCTAGAA 


CACCTGAATT 


CAGCCTATCA 


18600 


GATTCAGCAC 


CGAAAAGGTG 


GACGTCCACG 


TAGTCTGCCC 


ATGGAAGACC 


AGCTCATTAT 


18660 


GACCCTCCGT 


TACTTGCGAT 


ATTATCCCAC 


TCAGCGTCTG 


CTGGCCTTTG 


ATTTTGGCGT 


18720 


CQGTGTAGCT 


ACGGTAAATG 


CCATCATCAC 


TTGGGTGGAG 


GATACACTTC 


GTGCGTCAGG 


18780 


TAGCTTTGAT 


TTGGACCATT 


TAGAAGCCCC 


GAGTGCTGCT 


GTGGCTATTG 


ACGTGACCGA 


18840 


AAGTCCGATT 


CAGCGTCCAA 


ACT^AAACCAA 


AGCAAAAATT 


ATTCTGGTAA 


AAAGAAACGA 


18900 


CACACCTTAA 


AAACTCAAAT 


TATGCTGGAT 


TTGACGACAC 


ATAAAGTCTG 


TCAAATGGCC 


18960 


TTTTCTGACG 


GACATACGCA 


TGATTTTACT 


CTCTTCAAAG 


AAAGTATTGG 


ACAAAGTTTG 


19020 


CCTGAAACGA 


CGCTTGCCTT 


TGTTGACCTA 


6GTTATTTAG 


GCATCTTGAA 


ATTTCATGAG 


19080 


AATACTTTCA 


TTCCTGCTAA AAATTCCAAA 


AATCGCCGCC 


TGAGTGAGGA 


TGATAAGCAG 


19140 


TTAAATAAAG AGATGTCAGC GATACGAATT 


GAAATTGAAC ATTTTAACGC 


TAAATTCAAG 


19200 


ACCTTCCAAA TCATGTCAGT CCCTTATCGT AACCGCAGAA AACGTTTCX3A GTTACGGGCG 


19260 


GAATTAATTT GTGCCATCAT 


CAATTATGAA 


GTGAACTAGA 


TTCCXSAACAA 


GTCTAATATA 


19320 


CTTTTGAGAG 


AGGAAAATCC 


AGTTGTATAG 


GCTAAAGGTT 


TTATCCAAAG 


GTCTGAGACA 


19380 


ACGATTAGGC ACGATGGAAA 


GAACTTTTAT 


GTGGCTGATG 


ACGATCAGTG 


CATCTTCCTG 


19440 


TGTCATAATC 


ACAGGGCACA 


AGAAAGTAGG 


AATTTGAAAA 


GATGATTGAC 


CAACTATCTA 


19500 


AGTATTACAG 


TTGTAG6ATA 


CTAACTGAAA 


AGGATATTCC 


AAGTATTTTA 


TCTTTATATG 


19560 


AAAGTAATCC 


TCTGTATTTT 


CAGCATTGTC 


CACCAGAGCC 


AAATTTTGCA 


ACTGTAAAAG 


19620 


AGGACATGCT TTGTCTACCT 


GAAGGTAAAG 


CTAAGGCTGA 


TAAGTTTTTT 


GTTGGATTTT 


19680 


GGAATGGATC 


TGACCTTGTG 


GCTGTTATGG 


ATTTTGTCTA 


TGCATATCCT 


GATGAGGAGA 


19740 


CTGTTTTTAT 


TGGTTTGTTT 


ATGGTTGATC 


AAGCCTATCA 


GAGAAAAGGG 


ATTGOTAGTC 


19800 


ATATTGTGAC AGAAGCACTA 


GCTTATTTTG 


CTAAGAACTT 


TCGAAAGGCA 


CGTTTGGCTT 


19860 
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ATGTTAAGGG AAATCCGCAA TCTCAGCATT TTTGGGAAAA GCAGGGCTTT AAATCAATTG 19920 

GATGCGAGGT TAAGCAAGAA CTCTATACGG TTGTTATCGC TGAACAGAGC CTAGAAGATT 19980 

AGAAATGGCA TCAAGTAAGA ACTATTTGGA ATTTGTTTTG GAACAATTAT CAGGATTAGA 20040 

TGATGTGACT TACCGTTCCA TGATGGGGGA GTATATTCTT TACTTCCGCG GCAAGATTAT 20100 

TGGCGGCATT TATGACGATC GCTTTTTAGT TAAACCCGTG CAAGCAGTCT TAGATAAGAT 20160 

TGACCAATCT TCTTTTGAGT TTCCATACAA AGGTGCCAAA GAAATGATTT GAGTGGAAGA 20220 

ACTTGATAAT AAGATGTTTC TATAAGACCT AATTTTAGCT ATGTATAACC AACTGCCAAC 20280 

GCCCAAACCT AAAAAGAAAA AGCAAGGGTG AACGAAGTAA AAAAGAAGTC TGCTAAGGCC 20340 

CTGTCTTTGC ACGGGTAAAA TTTTATATAT AAAAAGAAGC TGGGACTAAA GAGCTCAGCT 20400 

TCCTTTGGTT TATATAATTG TCATTACAAG ACGAAGTGGT TGGGCGAAAC TCTGTTGACT 20460 

TTATTCAATT TAGAGTTTCT TATGCACAAT TGAGTCTGGA ACGAAAGTCT CCAGTTGCAA 20520 

AGTATACAGT ACAATAAACC AACGATGTAA TAGCTGATGA CACAAAGCAC AGTGGGTAGG 20580 

ACTTGCGAAG TCACCCTTTT CTTTTCAAAA TTTATACTAA ATCATTGATA TCAGTGTAGT 20640 

CACGATTAAG TCCTTGAGCA ACTGGTAGGT TAGTCAAGTA ACCTTGATAA GTAGTCACAC 20700 

CTTGACGCAA GCCTTCATCT TCAGAGATTG CTTGTGCGAA TCCTTTGCCA GCCAAAGCTT 20760 

CGATATAAGG AAGAGTGACA TTGGTTAGGG CGATGGTTGA AGTGCGAGCA ACCGCACCAG 20820

GGATATTGGC AACGGCATAG TGGAGAACAC CGTGTTTTTC ATAGACGGGT TCATCGTGCG 20880 

TTGTCACACG GTCAGCTGTT TCGATAACGC CACCTTGGTC AACAGCAACG TCAACGATAC 20940 

AGAGCCTGGA CGCATTTGTT TGACCATCTC ATCTGTCACC AATTCOGGTG CTTTTGCACC 21000 

AGGGATGAGA ATGGCTCCAA TCACCACATC AGCATCTCTC ACACTTGCTT CAATGTTGAA 21060 

TGAATTAGAC ATAAGAGTTT GAATTTGACT TCCAAAGACT TCTTCTAGAA CTGAGAGACG 21120 

CTTGGAACTA ATATCTAAAA TAGTCACTTG AGCACCAAGA CCAAGGGCGA TGCGGGCAGC 21180 

ATGTGTACCG ACGACACCAC CACCGATGAT AGTTACTTTT CCTTTTGQAA CACCTGGTAC 21240 

ACCACCAAGT AGAACACCAG AGCCACCAGC TTGCTTAGTA AGGAAGTGAG CTCCGATTTG 21300 

AACAGCCATA CGACCTGCAA CCTCACTCAT AGGAACGAGG AGCGGTAGTT GTCCTTGATT 21360 

GTCACGAACA GTTTCAGTTG TTTTTGCTGT TAACATAGCA TCTGCTAATT CTGGAGCAGC 21420 

GGCCATGTGC AAGTAGGTGA AGAGAAGAAG ATCGTCGCGC AAGTAACCGT ATTCAGAACT 21480 

TAAAGATTCT TTTACTTTCA CAACCAACTC TGCtGCCCMi. GCTTCACCAG CAGTAGCGAC 21540 

AATCTCAGCT CCTTGCTTTT GATAGTCAGC ATCAGTAAAG CCAGAACCGA GACCAGCATT 21600

TGTTTCGATA AGGACACGAT GACCACGACT AACTAAGCTA TGAACACCTG CAGGTGTGAG 21660

GGCGACACGG TTTTCGTTAT TTTTAATTTC TTTTGGGATT CCGATTAACA TTGAGATAAC 21720 

CTACCTTTCA ATTGACX3GTC TTGTTTTGGT TGTCACATTC CAGTTCATAA ATCAAAAATG 21780 

TGACGGTTTC ATTGTATATG AAACCGCTTC AAAAATCAAG AAAAACTTGT CATCCAAATT 21840 

TTTTTATGCT AGACTAGTGA AAATCAAGCT CTAATGGAGG GAAAAGTATG GAATCAATAT 21900 

TTGTGAAATT TGCCCAGTAT CCGTCTATAG AAACGGAGCG TTTATTGCTC AGACCTGTAA 21960 

CTTTGGATGA TGCGGAAcAA TGTTTGACTA TGCCTOGGAC AAGGGTAATA CACGTTACAC 22020

TTTTCCAACC AATCAAAGCT TGGAAGAAAC CAAGAATAAC ATTGCTCAGT TCTACTTGGC 22080 

TAATCCCTTG GGACGTTGGG GAATAGAACT AAAAAGCAAT GGTCAGTTTA TTGGAACCAT 22140 

TGACTTGCAC AAGATTGATT CTGTTCTTAA GAAGGCAGCT ATTGGCTACA TTATCAATAA 22200 

AAAGTATTG6 AATCAAGGAT TAACGACAGA AGCCAATCGT GCTGTGATTG AGCTAGCTTT 22260 

T6AGAAGATA GGGATGAATA AGTTGACTGC CCTTCACGAT AAGGCTAATC CCGCGTCAGG 22320 

AAAGGTCATG GAGAAATCAG GCATGCGTTT TTCCCATGCA GAACCATATG CTTGTATGGA 22380 

CCAGCATGAA AAAGGCCGAA TCGTGACAAG AGTTCATTAT GTCTTGACCA AGGAAGACTA 22440 

TTTTGCAAAT AAATAAGCAG TTGAAAAGAA ATTTTTCGAC TGTTTTTTCT TCCTCTTACG 22500 

AATAATCTAA GAGAGGAGAA AATATGGAAG CAATTATCGA GAAAATCAAA GAGTATAAAA 22560 

TCATCGTCAT CTGTACTGGT CTGGGCTTGC TTGTAGGAGG ATTTTTCCTG CTAAAACCAG 22620 

CTCCACAAAC ACCTGTCAAA GAGACGAATT TGCAGGCTGA AGTTGCAGCT GTTTCCAAGG 22680 

ACTCATC6AC CGAAAAGGAA GTGAAGAAGG AAGAAAAGGA AGAACCCCTT GAACAAGATC 22740 

TAATCACAGT AGATGTCAAA GGTGCTGTCA AATCGCCAGG GATTTATGAC TTGCCTGTAG 22800 

GTAGTCGAGT CAATGATGCT GTTCAGAAGG CTGGTGGCTT GACAGAGCAA GCAGACAGCA 22860 

AGTCGCTCAA TCTAGCTCAG AAAGTTAGTG ATGAGGCTCT GGTTTACGTT CCTACTAAGC 22920 

GAGAAGAAGC AGTTAGTCAA CAGACTGGTT CGGGGACAGC TTCTTCAACA AGCAAGGAAA 22980 

AGAAGGTCAA TCTCAACAAG GCCAGTCTGG AAGAACTCAA GCAGGTCAAG GGACTGGGAG 23040 

GAAAACGAGC TCA6GACATT ATTGACCATC GTGAOGCAAA TGGCAAGTTC AAGTCAGTAG 23100 

ACGAGCTCAA GAAOGTCTCT GGCATTGGTG GCAAAACAAT AGAAAAGCTT AAAGACTATG 23160 

TTACAGTGGA TTAAGAATTT CTCTATTCCC CTAATTTACC TGAGTTTTCT ATTACTTTGG 23220 

CTTTATTACG CTATTTTCTC AGCATCTTAT CTTGCTTTGT TGGGCTTTGT TTTTCTGCTA 23280 

GTCTGTCTCT TTATCCAATT TCCGTGGAAA TCTGCTGGTA AAGTTCTAAT AATTTGCGGA 23340 

ATCTTTQGAT TTTGGTTTGT TTTTCAAAAT TGGCAACAGA GTCAAGCGAG TCAAAATCTG 23400

GCGGATTCTG TTGAAAGGGT ACGGATTTTG CCTGATACTA TTAAGGTTAA TGGTGATAGT 23460

CTATCCTTTC GTGGCAAGTC TAACGGTCGT GCTTTCCAAG TCTATTATAA ACTCCAGTCC 23520 

GAGGAGGA6A AAGAAGCCTT TCAAGCTTTA ACTGACCTGC ATGAGATAGG ACTAGAAGGG 23580 

AAGCTTTCGG AGCCAGAAGG GCAGAGAAAT TTTGGTGGCT TTAATTACCA AGCCTATCTG 23640 

AAGACTCAGG GAATTTACCA GACTCTCAAT ATCAAAACAA TCCAGTCACT TCAAAAGATT 23700 

GGCAGTTGGG ATATAGGAGA AAACTTGTCC AGTTTACGTC GAAAGGCTGT GGTTTGGATT 23760 

AAGACGCACT TTCCAGACCC TATGGGCAAT TACATGACAG GACTCTTGCT GGGACATCTG 23820 

GACACCGACT TTGAGGAGAT GAATGAGCTT TATTCCAGTC TAGGAATTAT CCACCTCTTT 23880 

GCCCTATCTG GCATGCAGGT AGGTTTTTTC ATGAATGGAT TTAAGAAACT TCTCTTGCXJA 23940 

TTGGGCTTGA CCCAAGAAAA GTTGAAATGG CTGACTTATC CCTTTTCCCT TATCTAT6CG 24000 

GGACTAACTG GATTTTCAGC ATCGGTTATT CGCAGTCTCT TGCAAAAGCT ACTGGCTCAA 24060 

CATGGGGTTA AGGGCTTG6A TAATTTTGCC TTGAC6GTGC TTGTCCTCTT TATTGTCATG 24120 

CCAAACTTTT TCTTGACAGC AGGAGGA6TC TTGTCCTGCG CTTATGCTTT TATCCTGACC 24180 

ATGACCAGCA AA6AAGGGGA GGGGCTCAAG GCTGTTACTA GTGAAAGTCT A6TCATCTCC 24240 

TTGGGCATAT TGCCCATTCT ATCCTTCTAT TTTGCX3GAAT TTCAACCTTG GTCTATCCTT 24300 

TTGACCTTTG TCTTTTCCTT TCTTTTTGAC TTGGTCTTCT TACCGCTCTT GTCTATCTTA 24360 

TTTGTCCTTT CCTTTCTCTA TCCAGTCATT CAGCTGAACT TTATCTTTGA ATGGTTAGAG 24420 

GGCATTATTC GCTTGGTCTC GCAGGTGGCA AGGAGACCAC TTGTCTTTGG TCAACCCAAC 24480 

GCATGGCTTT TAATCTTATT GTTAATTTCC TTGGCTTTGG TCTATGATTT GAGGAAAAAC 24540 

ATTAAAGGAT TAACAGTATT GAGTTTATTG ATTACAGGTC TCTTTTTCCT TACCAAGTAT 24600 

CCACTGGAAA ATGAAATCAC CATGCTGGAT GTGGGGCAAG GAGAAAGTAT TTTCTACGGG 24660 

ATGTAACTG6 GAAAACCATT CTCATAGATG TAGGTGGTAA GGCAGAATCT TATAAGAAAA 24720 

TCAAAAAATG GCAAGAAAAG ATGACGACCA GCAATGCCCA GCGAACCTTG ATTCCCTATC 24780 

TCAAAA6TC6 AGGAGTAGCT AAGATTGACC AGCTAATTTT GACTAACACG GACAAGGAGC 24840 

ATGTTGGAGA TTTGTCAGAG ATGACCAAGG CTTTCCATGT AGGGGAGATT CTAGTATCAA 24900 

AAGACAGTCT GAAACAGAAG GAATTTGTGG CAGAACTACA GGCGACTCAA ACAAAGGTGC 24960 

GTAGTATGAT AGTAGGGGAG AACTTGCCCA TTTTTGGAAG TCAGTTAGAA GTTCTATCTC 25020 

CAAGGAAAAT GGGAGATGGA GGACACGAT6 ATACCCTAGT TCTGTATGGG AAATTCTTGG 25080 

ATAAGCAATT TCTCTTCACG GGAAATTTGG AGGAGAAAGG AGAGAAGGAC TTGCTGAAGC 25140

ACTATCCAGA CTTGAAAGTA AATGTTTTGA AAGCTAGCCA ACATGGCAAT AAAAAATCAT 25200

CAAGTCCAGC CTTTCTAGAA AAACTCAAAC CAGAGCTTAC TCTTATCTCA GTTGGAAAGA 25260 

GCAATCGAAT GAAACTCCCC CATCAGGAAA CATTGACACG ACTGGAAGGT ATCAATAGCA 25320 

AAGTTTATCG AACTGACCAG CAAGGAGCTA TACGTTTTAA GGGGTTGGAT AGTTGGAAAA 25380 

TCGAAAGTGT TCGATAGGAA GGATAAATGT TGTAGATTAG TGAAATAAAC TAAAAATTTG 25440 

TTGCATAATA ATGATAAAAA TGGTATAATG AAAACGTATT CAATATTGAG GATATAAAAT 25500 

CATTAAAAAT CAGCAAAAGT TGTTTTATTA GTTAGTTTAT AATCTATTGG TCTTCTTCAG 25560 

TCCAGTGTAT CTGCTGTGAC AGTCACTAAA AGTTACAAGT ATGATTGGAA TACGGTTTGG 25620 

GAATATAGTA CCAACTATCA COACCATCAG TATGCTTGGA TTCCGTCATG GTCTCX3TTAT 25680 

GACAGCTATT CTGAGTATAA AGTTGGCGGA GGCTGGAACT ACGCTCGTTA TGAGGTCATA 25740 

AACTATTACA GCGGAGGCTA TTAATTCTTA AAGAGT6AGA AAAAGGAGG6 CTAGATATGT 25800 

TGCAGCTTAC TCATGTGACC TTAAAAACGC GACAAGTCAT CTTGCAAGAT GTGGATTTCA 25860 

CCTTTAAAAA GGGTAGGGTT TATGGTCTTC TTGCTATCAA TGGCTCTGGA AAGACGACCC 25920 

TGTTCCGTGC CATTAGCAAT TTAATTCCCA TAAGTAGTGG AAATATCGCA GCCCCTCCTT 25980 

CTTTATTTTA TTATGAGAGT ATTGAATGGC TGGATGGAAA CTTAAGTGGG ATGGACTACC 26040 

TTCGTCTTAT CAAAAACATC TGGAAGTCAG GTCTGAACTT GAGGGATGAA ATCGCCTATT 26100 

GGGAAATGTC TGACTATATC AGTCTTCCCA TTCGCAAGTA TTCCTTAGGC ATGAAGCAAC 26160 

GCTTGGTGAT TGCCATGTAT TTCCTCAGTC AGGCCAAATG CTGGCTCATG GATGAGATTA 26220

CAAATGGCTT AGATGAGTAT TATCX3ACAGA AGTTTTTTGA TAGGCTAGCA CAAATCGATA 26280 

GACAAGAACA GCTGGTTCTT TTAAGTTCCC ACTATAAGGA AGAGTTGGTT GATGTCTGCG 26340 

ATAGAGTAGT AACCATTCAT CAGGGGCAGA TAGAAGAGGT TTAGTTTATG AAAGATGTTA 26400 

GTCTATTTTT ATTGAAAAAA GTTTTCAAAA GCCX3CTTAAA CTGGATTGTC TTAGCTTTAT 26460 

TTGTATCTGT ACTCGGTGTT ACCTTTTATT TAAATAGTCA GACTGCAAAC TCACACAGCT 26520 

TGGAGAGCAG GTTGGAAAGT CGCATTGCAG CCAACGAGAG GGCTATCAAT GAAAATGAAG 26580 

AGAAACTCTC CCAAATGTCT GATACCAGCT CGGAGGAATA CCAGTTTGCT AAAAATAATT 26640 

TAGACGTGCA AAAAAATCTT TTGACGCGAA A6ACA6AAAT TCTGACTTTA TTAAAAGAAG 26700 

GGCGCTGGAA AGAAGCCTAC TATTTGCAGT GGCAAGATGA AGAGAAGAAT TATGAATTTG 26760 

TATCAAATGA CX^CGACTGCT AGCCCTGGCT TAAAAATGGG GGTTGACCGC GAACGGAAGA 26820 

TTTACCAAGC CCTGTATCCC TTGAACATAA AAGCACATAC TTTGGAGTTT CCGACCCACG 26880 

GGATTGATCA GATT6TCTG6 ATTTTAGAGG TTATCATCCC AAGTTTGTTT GTGGTTGCTA 26940

TTATTTTTAT GCTAACACAA CTATTTGCAG AAAGATATCA AAATCATCTG GACACAGCTC 27000

ACTTATATCC TGTTTCAAAA GTGACATTTG CAATATCCTC TCTTGGAGTT GGAGTGGGAT 27060 

ATGTAACTGT GCTGTTTATC GGAATCTGTG GCTTTTCTTT TCTAGTGGGA AGTCTGATAA 27120 

GTGGTTTTGG ACAGTTAGAT TATCCCTACC CTUVrTTATAG CTTAGTGAAT CAAGAAGTAA 27180 

CTATTGGGAA AATACAAGAT GTATTATTTC CTGGCTTGCT CTTAGCTTTC TTAGCCTTTA 27240 

TCGTCATTGT GGAAGTTGTG TACTTGATTG CTTACTTTTT CAAGCAAAAA ATGC?CTGTCC 27300 

TCTTTCTTTC ACTCATTGGG ATTGTTGGCT TATTGTTTGG TATCCAAACC ATTCAGCCTC 27360 

TTCAAAGGAT TGCACATCTG ATTCCCTTTA CTTACTTGCG TTCAGTGGAG ATTTTATCTG 27420 

GAAGATTACC TAAGCAGATT GATAATGTCG ATCTAAATTG GAGCATGGGA ATGGTCTTAC 27480 

TTCCTTGCCT GATTATCTTT TTGCTATTGG GAATTCTATT TATTGAAAGA TGGGGAAGTT 27540 

CACAGAAAAA AGAATTTTTT AATAGATTCT AGCTTTCCTA TAGGTAGGGA AAATAAGTAA 27600 

AAACTAACAT AGAGAGGGAA TCAACTTGAT TCTCTCTTTT TGAITCGAAA ACCAAACCAA 27660 

AATACAAACA CAAACTTTTC AAAAAATAAC TTTTTATCTT GACAAGAGCT AGAAAACTTG 27720 

GTATCATATA AAAGTTGAGA AAAGCAGAAG TGAGAGCTTC TCGCCTTGTG ACATTAAGTT 27780 

GCCTGGCCCT ACGGAT6AAA AGTTTCGAAG AAACGCTATC ATAACGTGCG GGCTTGTATA 27840 

TTTACAAGTC CGCTATTGTT TTTCTCTAAT AAAACAAAAG AGGTGAAAAC CATAGCAAAG 27900 

CAAGACTTAT TCATCAATGA TGAGATTCGT GTACGTGAAG TTCGCTTGAT TGGTCTTGAA 27960 

GGAGAACAGC TAGGTATCAA GCCACTCAGT GAAGCGCAAG CTTTGGCTGA TAACGCTAAT 28020 

GTTGACCTAG TATTGATTCA ACCCCAAGCC AAACCGCCTG TTGCAAAAAT TATGGACTAC 28080 

GGTAAGTTCA AATTTGAGTA CCAGAAGAAG CAAAAAGAAC AAC6TAAAAA ACAAAGCGTT 28140 

GTTACTGTGA AAGAAGTTCG TCTAAGTCCG G 28171 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7147 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CCGCTCAACT TTTGCAATCA AGGCTAAGTA GACAGCAGCA AATTTCATAT TGTATAATTT 60 

CTGACTCATA CTTCTCTCTT TCTATGTGTA CTAGTATAAA TAA6AAAAAG AAGGCCGTCA 120

AGCCTTCTTT TGATTTATTC TTCTGCTTCA TCTTCTGTAA ATTGACTATT GTACAAGTCA 180

GCGTAGAAGC CACCTTGCGC CATCAGTTCC TCATAGTTGC CTTGCTCGAT GATATTTCXTA 240 

TCTTTCATGA CCAAGATCAA GTCTGCATTT CGGATGGTTG ACAAGCGGTG GGCAATGACA 300 

AAGGATGTGC GTCCTTCCAT CAAACGGTCC ATGGCTTTTT GGATCAATTC CTCTGTCCGT 360 

GTGTCAACAG AAGAAGTCGC CTCATCCAAA ATCAAAAGCG GTGCATCCTT AAGAAGGGCA 420 

CGAGCAATAG TCAATAGTTG TTTTTGTCTT ACAGACAAGG TCACGGTGTC ATCCAAGATG 480 

GTATCATAGC CATCTGGCAA GGTCATAATA AAGTGGTGAA TTCCCACAGC CTTACTAGCT 540 

TCCATCATTC GTTCATCACT AATCCCTATT TGATTATAGA TGAGATTGTC TCGAATAGTT 600 

CCTTCAAAGA GCCAGGTATC CTGCAAGACC ATTGAAAAGG CATCATGCAC TTCTGAACGC 660 

GTCATAGCCT TGGTATCCAC ACCATCAATG CGAATACTTC CCTTATCAAT CTCATAGAAT 720 

TTCATCAAAA GATTGACAAT GGTTGTCTTA CCAGCCCCAG TCGGCCCAAC AATGGCAACC 780 

TTTTGACCAG CATGAGCTGT CGCAGAGAAG TCATAGTCTT GAACATTGAC ACCGTCCACC 840 

AGAATTTCTC CTGCTGACAC GTCGTAGAAA CGTGGAATCA GATTGACCAG AGTTGATTTA 900 

CCAGAACCTG TTGACCCAAT AAAGGCCACT GTTTGACCAG TTTCTGCTTT AAAGCTAACA 960 

TGTTCAATAA CTGCCTCCGA ATTTGCCGCA TAGCGgAAGG TCACATCCTT AAACTCGACC 1020 

TGACCTTTGA AGTTTTCATC AGTCAGCTGC ACTTGAACAG GGTTTTGGAT AGAAGAATGC 1080 

AAATCTAAAA CTTGATTAAT CCGCTTAGCA GAGACCATAG TTCGGGGAAG AACGATGAAG 1140 

AGTGCTCCCA TGAGAAGGAA GCCCATGACA ACCTACATGG CATAAGACAT GAAAACAATC 1200 

ATGTCACTAA A6AGAG6CAG ACGCGCTATC GGAGCAGCGT CGTTAATCAC ATAGGCCCCA 1260 

ATCCAGTAAA TCGCCACACT CAAACCACTT GAAATCCCCA TCATGATAGG ATTCAAAATA 1320 

GCCATAAGAC GGTTGACAAA CAAATTCAAA CGGGTCAATT CATCATTTAC TGCTGCAAAT 1380 

TTTTCATTTT GATAATCCTC TGCATTGTAG GCACGAACGA CACGAATACC TGTTAAACTC 1440 

TCACGAGTGA TACTGTTCAG TTTATCTGTC AGCCCCTGAA TCAAGGACTG TTTTGGAAAG 1500 

GCTAGCGTCA TCAAAACGGT CGTCATCAGG ACGTTGATAA TCACTGCCAC AAGTACGGCC 1560 

CAGAGCCAGT ATTCTGAATG ACCTAAAATC TTCCCAATAG CCCAGATAGC CATAATTGAA 1620 

CCACGCGTTA CCACTTGCAA GCCCATAGTA ATCAACATTT GAACTTGAGT AATGTCATTG 1680 

GTAGTACGCG TCAAGAGGCT AGGAATTGAA AATTTCTTAA TCTCTGTCTG CGAGTAATCC 1740 

AAAACTCGGT TAAAAATATC ACTTCTCA6C CTACTAGTAT AAGAAGCCGC CACTCGGGAT 1800 

GCAAAAAATC CAACTGCAAC TACGGACAAG AAGGCAAGAA AGGACATTCC CATCATCATG 1860 

CTTGCCGACT GCCACAACTC ATCTAAATTA GTTTCTTGAC TACCTAGCAA ATCCGTAATT 1920

TTCGAGATAT AGGTCGGCAC TTCCAACTCT AGATAGACCX5 AAAAGCAAGT AAA6AGAATG 1980

GCTAGTAAAA TCATCCCCCA TTCTTTTCTA CTAATTCTTT TGGCTAATTT CTTTATTCTC 2040 

TCCTCCTATT CCCTTGATAT TTTGCCTGTA GTTGACCGAG AACCTTCTCA AAAATCAGTA 2100 

ATTCATCTTC ATCAATGTCT TCCATCAACT GCTTGTCTAT GCGTTCAAAA AAAGCCTTAA 2160 

CCTGTTGCAT CTGAGAACGT GCTTTGTCCG TCAGACGAAC AAACTTAGCC CGCTTATCAA 2220 

CAGGACTCGC CTCCAATTCC ACCAAACCAT TTTGCACTAT ACGCTTAACC AGATTACTAG 2280 

CAACAGGCTT GGTAATATTG AGTTCCTGCT CGATATCTTT AATCAAGACC AAGTCTTGGT 2340 

TTTTCTCGCG ATTATCCAAA AAACGCACAA CCTGACCTTG CGGCCCACCC ATAAATTCAA 2400 

TGCCGCAACG TTTGGCTTCC TTTTGCACCA TCAGGTGAAT TTGATGACCA AAACGCTTAA 2460 

AGACTAACAT CGGTTTATCC ATAATCTCCC CCTTCTAAAT AAAAATAGTT CTCTGGAGAA 2520 

TAATTAAATT TCTATGAGAA CTATTTTCTT GATTAAAAAA ATCCCAAGTG ATTTTCTCAC 2580 

TTAGGATCAT GTTCTATAGG TTAAATTAAA ACCCATCTAC GTTCGTATAA ATCTTTTGGA 2640 

CGTCTTCGTC GTCTTCAAGA ACGCTGTAAA GTTTTTCAAA GGTTTCAAGG TCTTCGCCTG 2700 

ACAATTCCAC TTCTGACTGA GGAATCATTT CCAATTCAGT CACTTGGAAT TCTTCAATAC 2760 

CAGACTCACG GAGGGCAACG ATAGCCTTGT GAAGGTCAGT TGGCGCTGTG TAAACTGTGA 2820 

TTGTACCTTC TTGTGCTTCT ACGTCATCCA CATCCACATC CGCTTCGAGC AATTGCTCAA 2880 

AGACTGCGTC CGCATCTTCA CCTCCAAATA CAATAACACC TTTGTTGTCA AAGAGGTAAG 2940 

AAACAGAACC TGAAGCGCCC ATGTTTCCGC CGTTTTTACC AAAGGCTGCA CGGACATTGG 3000 

CTGCTGTACG GTTGACGTTA GAAGTCAAAG TATCCACAAT TAGCATAGAG CCATTTGGCC 3060 

CAAAACCTTC GTAACGTCCT TCTGTAAAGG TTTCGTCTGT GTTTCCTTTG GCTTTATCAA 3120 

TCGCTTTATC GATAATGTGT TTTGGCACTT GGGCTTGTTT AGCACGGTCG ATAACGAATT 3180 

TCAAAGCTGA GTTTGATTCT GGATCTGGAT CACCTTTTTT AGCTGCTACA TAGATTTCTA 3240 

CACCAAATTT TGCATATACT TTAGAGTTAG CTCCATCTTT AGCCGTTTTC TTGGCTACGA 3300 

TATTGGCCCA TTTACGTCCC ATTAGGAATC TCCTTTTTTC ACATTTTAAT CTTTCTTATT 3360 

ATAACACAAG TTTTTTTGAT TTTCACTAGA GGAAATGGAT TTTATTAGCA AATCAAGCTA 3420 

GGATAGCACT TTACCTGCTA AGATGGTCTT GCCTTTCTAT CTTTATCAAC AGGCACTCAT 3480 

CCACATTCAA AAAACAAACT AGACCATTAT CTGCAAATAG AAAGTTTCAG CCAAGTTTGA 3540 

CAAAGTCAGC TCAAATTACT GTTTGAAGTT TGTAGATATA AGCGACAAAA ACAATCATAC 3600 

TGCACCTTTT GTTGACAGTC TACTCCAGAC ATATCATAGT TCAAGTAAAT ACTTTGAAAT 3660

TCAACAGTTC TTATAGGCGC TATTGTATTC TAAGAAATCA ATAGAAGAGT TTCTAAGCAA 3720

ACCTCTAATA CTCAATAAAA ATCAAAGAGC AAACTAGAAA GCTAGCCTCA GGTTGCTCAA 3780 

AACACTGTTT TGAGGTTGCG GATGGGGCTG ACATGGTTTG AAGAGATTTT CGAAGAGTAT 3840 

AATTTACGTG TTCCCAAGAT GGAGAAGTTA GACTAGTACA CTGGCACTTC TAAAACATTG 3900 

CTAGCAATTG ATTTGTTCAT ATTTAATTTC ATTTTTTCCA TAAATGGGTA TTAGATATAA 3960 

ACAGCAAAAT ATTTCCGATA CGTGTCGTTC TTGAATTTCC AATCATCTAA AACAAGTAAA 4020 

GGATAATCAA TCCCCTGTAT ATCAAGGAAT TGGCTACCCT TTTTACTTTT TTACACATTC 4080 

TGTTTGATAG ATTCATTTTA ACATCACGAG CATACTCCAA TGGAAATCGC TAGGCAAGAG 4140 

ATAAACTTTC AGATATCCGC AGAGAGATCA TCGCCTCTTT TTGTCGCAAG CATTCTCCTC 4200 

TCCTAGTCAT TTTCTACCTT ATCTTCTACC TGAGGATAGA GAGTTGTTCC CCAAATAGAA 4260 

ATCGTCCGCT TACGCACTAG TGGCAAATCG GTTTTTTCAT AAACCGTACG CCACCATTCC 4320 

CAGGCAAGCC CGGTACACTC TCTAATTTTG ACAGAGAGAT TACGAACATT CCCTTTTAAA 4380 

GGAATACTAG TGGTAAAGTG AGCCGTTAAA TCCTGCCCAT TTCTGTCCCA AGCCTTAGGA 4440

GTCAAGACTT CCTTACCTTG ATGATCATAG GATAATTCAT TCCAAGTAAT ATAATATTGG 4500 

GCAACATAGG CACCACTATG ATCCAGCAGT AAATCTCCGT TTCTGTAAGC TGTAACCTTA 4560 

GTCTCAACAT AGTCTGTACT ATTTTGAAAG GTCGCAACTA CATTGTCACG TAAAAAAGAA 4620 

GTTGTATAGG AAATCGGCAA GCCTGGATGA TCTGCTGTAA AGCGACTGCC TTCTTGAATC 4680 

AAGTCCTCTA CCATATCCAC CTTGCCTGTT ACAACTCGGG CACCCGAACT TGGGTCGCCC 4740 

CCTAAAATAA CCGCCTTCAC TTCTGTATTG TCCAAAATCT GTTTCCACTC TGTCTGAGGA 4800 

GCTACCTTGA CTCCTTTTAT CAAAGCTTCA AAAGCAGCCT CTACTTCATC ACTCTTACTC 4860 

GTGGTTTCCA ACTTGAGATA GACTTGGCGC CCATAAGCAA CACTCGAAAT ATAGACCAAA 4920 

GGACGCTCTG CAGAAATTCC TCTCTGTTTT AAATCCTCTA CCGTTACAGT ATCTTGAAAC 4980 

ACATCTCCTG GATTTTTAAC AGCATCTACG CTGACTGTAT AATAAATCTG CTTAAAATTA 5040 

ACAATCTGAA TCTGCTTTTC GCCTGAATGG ACAGAGTTAA AATCAATATC AAGAGAATTC 5100 

CCTGTCTTTT CAAAGTCAGA ACCAAACTTG ACCTTGAGTT GTTCCATGCT GTGAGCCGTG 5160 

ATTTTTTCAT ACTGCATTCT AGCTGGGACA TTATTGACCT GACCATAATC TTGATGCCAC 5220 

TTAGCCAACA AATCGTTTAC CGCTCCGCGA ACACTTGAAT TGCTGGGGTC TTCCACTTGG 5280 

AGAAAGCTAT CGCTACTTGC CAAACCAGGC AAATCAATAC TATAAGTCAT CGGAGCACGA 5340 

TCGACCGCAA GAAGAGTGGG ATTATTCTCT AACAAGGTCT CATCCACTAC GAGAAGTGCT 5400 

CCAGQATAGA GGCGACTGTC GTTGGTAGCT GTTACAGAAA TATCACTTGT ATTTGTCGAC 5460

AAGCTCCGCT TCTTTCTTTC GATAACAACA AACTCATCGG GTAGCTGATT ACCCTCTTTG 5520

ATGAAACGAT TTTCAATACT TTCTCCCTGA TGQGTCAT^A GTTTCTTTTT ATCGTAATTC 5580 

ATAGCTAGTA TAAAGTCATT TACTGCTTTA TTTGCCATCT TCTACCTCCT AATAAGTTCC 5640 

TGGATTGAGT TGCATAAACT CAGACTTGTT CAGCGAAATC AGCCGTGGTT 6GACTAAGTA 5700 

ATCCAAAATT TCCTCGTACA ATTCTTCTGA GACATTGCGT CX3CCGTCTGG CTAAATAAGA 5760 

AGTCGGAATG ACCGTATTAT CCAACATAAA TACCTTATCT AAGTCAATCA AGGTTGGTCT 5820 

TGTAAAAGGA TTACGAGCTA GATCCX3GCTC TTCTATCATA AAGTTCTTGA CCAAACGTCT 5880

GGTCAAGAGA GCTGGTTTGA AGGTCTGATT TTTAACCAAC TCTTTGTTTT TAGTCATGCT 5940 

GTTGTCAATA CAGATATACA TATGATTCTT CACAGCCAAA TCGCTACTAA TAGTCGGAAA 6000 

AGGCAAATAA AGAGCTACAA CATCTCCTCT CTTAATCAAG CAA6AGCACC CCCTTTTCTC 6060 

CTAATGTAAC ATAGACAGGA TTGACCAAGT CTTCTGATTG ACTCAGAATT TCCAAAGTTT 6120 

GAGTTTGGCG CGCTGTCAAT TTAGTAGCAT CTTGTCTCTT CAATACAAAA TGCTTGTC6C 6180 

CAATAACCTT GACAATATAA TCCTTCTCCA AAGCTGACTG GTAAATCCAC ATCAGATGTT 6240 

GTCTGTCCTG AGAACTCAAG AGAGAAGGAT TTTCAAGCCT CCCGATAGTC TGATAAAAAT 6300 

CAAAAACAGG AGCTAACTCC TGCCAATCTG ATTGGCTAGT TGTCAAGGCT AGAAAAAGGG 6360 

CTTTGCGAGC TGATACTTCT TGGTTAGCCT TGAGAGTTAC TTTCCCCTCC AAGTTTTTTA 6420 

GAAATCGGGA AACTCCAGAA AGCAAATTTT TCTCTAACTG CGAGAAATAA AAACCTTTCG 6480 

TTCCCAGACA TAAGTCTTTC ATGTCGCTTT CTCTAGCAAA TAAGAGCTCA AACATTTGAT 6540 

AGTAAAAGAA AAATATCTGG CACTGGGTCG CGCTCATCTT TTCCTTATCG GCTTCTTTTT 6600 

TTAACCAGAG CAAG6GCGAC AGGTAGCTGG ATTGAGACAT TTCCTCTACXr TCCTACTCTT 6660 

TTTTAACTGG AGCATCTGCA CTAGCTGCCA CTTCTTTTGA CTGGATACTT TCCCACTGGT 6720 

TAATCTCCTC TGAGATAA6A CCTTCGCATG TCTT6ACAAA TAGGGCAAAA GCCTTGGTCT 6780 

TTCCTGCATA TTTCTCCGTT TGGCATTGAT AGAGGAATTT TTCTTTCTCC AGGAGTTGCX5 6840 

CAGTTTTTTG GTAAGAAATC CAATTTTCCT TTGCATTATA CAAATTGATA ATCCCCTCAC 6900 

ACAGCAAGCC GAGACTGGAT AAGGCAACCG AAATCAAACG GTAGCGATCA CCTGGCATAG 6960 

GAATAGCACA AAAGACAGCT ATGAGGAAAC CTGCCACGAT TTCTGTTATT TTTAATACCT 7020 

TATAGCGCCT ACGATGTTGA ACGCTTTTCT TTAAAAAATG AGCTATCTGT ACGTCTAATC 7080 

GCTCTGTCAG GTACATTTCT TCTGGCGTCA TATTCGTAAC TCCTTTCATT TACTTTGATA 7140 

ATCAGG6 7147

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 755 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



CCGCATGGGA TTGGTGTCCT TTTGGGCAAT CTCTTTGACC AAACTGGAAA CATGTTTTAT 60

GCGCCTGCCT TTACTGCCCT TGTCGGCGGT ACGTCTATAT GATCCTAGTC GCAAAAGTTC 120

CGCGCTTTGG AGCCATTACC ACTATCGGCC TTGTCATTGC CCTCTTTTTC TTQGGAACTA 180

AACACGGTGC TG6TTCCTTC CTTCCTGGAA TTATCTGTGG CCTCCTAGCA GATGGAGTAG 240

CTCATTTAGG AAAATACAAG GACAAAACAA AGAACTTCCT TTCTTTCATT ATTTTCGCCT 300

TTAGTACAAC AGGACCAATC TTGCTTATGT GGATTGCGCC CAAAGCCTAT ATGGCTACTC 360

TTCTGGC7UVG AGGAAAATCC CAAGAATATA TCGACCGTAT CATGGTCGCT CCAAACCCTG 420

GAACTGTCCT TCTATTTATC GCAAGTATTG TCATCGGAGC CCTAGTGGGT GCCTTGATTG 480

GACAAGCCTT GAGTAAAAAA TTTGCCCAGA AAATCTGATC AGTTAAAAAG AGCCACGCGG 540

CTCTTTTTTA TTTATGGCTC AATTTCTTAG TCAAGAAATC TCCCAAGAAT TGGATTGCAA 600

AGATAATCAA AATGATAATA ATGGTTGCCA AGATG6TCAC ATCGTGATTG TAGCGGTTAA 660

ATCCATAAGC GATGGCTACG TTACCGATAC CACCAGCTCC AACCGCACCG GCCATAGCTG 720

TTtcCCAACA AGGGaAtCAA GGTcACAGTC 6TCAC 755



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3010 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TTCAATTGGT ATCTCAATCA ACGGTCTTCA CATGGTTTCA ACTGGTTTGA CTCTTGAAAA 60 

AGCGAAAGCT GCTGGTTACA ACGCAACTGA AACAfiGCTTT AACGATCTTC AAAAACCAGA 120 

ATTCATGAAA CATGACAACC ATGAAGTAGC AATTAAGATT GTCTTTGACA AAGATAGCCG 180 

TGAAATTCTT GGTGCCCAAA TGGTTTCACA TGATATTGCA ATTAGCATGG GAATCCACAT 240 

6TTCTCACTT GCTATCCAAG AGCATGTGAC AATTGATAAA TTGGCATTGA CAGACCTCTT 300

CTTCTTGCCA CACTTCAACA AACCATACAA CTACATCACA ATGGCTGCCX: TTACGGCTGA 360

AAATTAAAAA TGAATGAGCT ATCTGGCCTT AAGTTAAGGT CAGATAGTTT TTAGCTAATT 420 

TGTCCCCATA CAATTATAGT TTTTTTATCT TGTGCTTCAT TCTGTTCTGA CTTAAAATGA 480 

AAAGGTAGCT ACCAATACAA ATGATGA6GA TAAAACAAAT GACTGAAAAT CGTTATGAAC 540 

TAAATAAAAA CTTGGCACAG ATGCTCAAGG GTGGTGTTAT TATGGATGTG CAGAATCCTG 600 

AACAGGCTCG TATCGCAGAA GCTGCTGGTG CGGCAGCTGT GATGGCCTTG GAACGAATTC 660 

CGGCTGATAT TCGTGCAGCT GGAGGAGTTT CCCGCATGAG CGACCCAAAG ATGATTAAGG 720 

AAATCCAAGA AGCGGTTAGT ATTCCAGTAA TGGCTAAGGT CAGAATCGGG CATTTTGTTG 780 

AAGCTCAGAT TTTA6AGGCT ATTGAAATTG ATTATATCGA CGA6AGTGAA GTTCTATCTC 840 

CAGCTGATGA CCGTTTCCAT GTGGACAAGA AAGAATTCCA AGTTCCTTTT GTCTGTGGTG 900 

CTAAGGATTT GGGTGAA6CC TTGCGTC3GTA TCGCTGAAG6 T6CTTCCATG ATTCGTACCA 960 

AAGGAGAACC AGGGACAGGG GATATCGTCC AAGCTGTTCX3 TCATATGCGT ATGATGAATC 1020 

AGGAAATTC6 CCGCATTCAA AACTTACGTG AGGACGAGCT TTATGTTGCT GCCAAGGATT 1080 

TGCAAGTCCC TGTAGAATTG GTCCAATATG TTCATGAACA TGGAAAATTG CCAGTTGTAA 1140 

ATTTCGCTGC TGGAGGTGTT GCAACGCCAG CAGATGCTGC GTTAATGATG CAATTAGGGG 1200 

CAGAGGGGGT CTTTGTCGGT TCAGGTATTT TCAAGTCAGG AGATCCTGTT AAACGAGCGA 1260 

GTGCCATTGT TAAGGCTGTG ACTAACTTCC GTAATCCTCA AATCCTAGCT CAAATCTCTG 1320 

AAGATTTAGG AGAAGCCATG GTTGGTATTA ATGAAAATGA AATCCAAATT CTCATGGCTG 1380 

AACGAGGAAA ATA6AT6AAA ATCGGAATAT TGGCCTTGCA AGGGGCCTTT GCAGAACATG 1440 

CAAAAGTGCT AGATCAATTA OGTGTCX3AGA GTGTAGAACT CAGAAATCTA GATGATTTTC 1500 

AGCAAGATCA GAGTGACTTG TCGGGTTTGA TTrTGCCTGG TGGTGAGTCT ACAACCATGG 1560 

GCAAGCTCTT ACGTGACCAG AACATGCTAC TTCCCATCCG AGAAGCCATT CTATCTGGCT 1620 

TACCAGT6TT TGGGACCTGT GCX;GGCTTAA TTTTGCTGGC TAAGGAAATC ACTTCTCAGA 1680 

AAGAGAGTCA TCTAGGAACT ATGGATATGG TGGTCGAGCG TAATGCTTAT GGGCGCCAAT 1740 

TAGGAAGTTT CTACACGGAA GCAGAATGTA AGGGAGTTGG CAAGATTCCA ATGACCTTTA 1800 

TCCGTGGTCC GATTATCAGT AGTGTTGGTG AGGGTGTAGA AATTTTAGCA ACAGTGAACA 1860 

ATCAAATTGT TGCAGCCCAA GAAAAAAATA TGTTGGTAAG TTCTTTTCAT CCAGAATTGA 1920 

CTGATGATGT GCGCTTGCAC CAGTACTTTA TCAATATGTG TAAAGAAAAA AGTTGAGATT 1980 

GAATTTCTCA ACTTTTTTAC ATCTAATAAA CAATAGCGAT GTATTGAAGT GCGGACGCAG 2040

CTAGGATAAA GAGATGCCAA ATCATGTGGA AAT/U^GGTTT TTTCTTGGCA TAAAATCCAG 2100

CTCCAACTGT ATAACAGAGT CCGCCA6TTA CCATGAGACT CCAGAAAACG GGTGTCGTTT 2160 

GACTGATAAT GGCAGGAATG ATAGCCAGAA CCAACCAGCC CATAATCAGG TAAAGAGCAA 2220 

GGCTAAATTT CTCATTGACC TTTTTAGCAA AGATTTTATA GAGAATACCA AAGATGGTCG 2280 

TTCCCCATTG GATGACAATA ATCAGATAGC CAAACCAGTT ATTCATCAAG GTCAAGACAA 2340 

CGGGCGTGTA TGAGCCGGCA ATGGCAACGT AAATCATAGA ATGGTCAATG ATTCGCAAAA 2400

CATATTTGTG GGTCGAACCA TAGGCCATA6 AGTGATAAAT GGTGGATGAT AGGAACATGA 2460 

GAAAGAGACT GATGACGAAA ATGGAAACGC CGATAGAGGA TAAAAATCCG T6TGCTTCAT 2520 

AACTATAGAT GGATGAAATA GGCA6CAAGA TAAGCATGAT GACTGCACCC ACAGCATGGG 2580 

TCACGCTATT AGCAATCTCC TCTCCAAAAC TGAGTTGTTT GCTGAGTTTA AGACTAGTGT 2640 

TCATTGGATT ACCTCCTCTT GAGTAT6ATC GATTAAGTCT AGAGTTTGAT GATAGAGTTT 2700 

AACGGTTTGG CAGCTGGTTT GGATAATAGG GTTAGCTGGG TCAATTCCTT GGTTCATGTA 2760 

GTCCACAAAA GCATCGTAGA GTTGGTCTGA ACTTGCTTGA GTTTGTAGAG TATTAAGTGT 2820 

CTGGGCTATT TCTTGAATAG AAAATACAGA CTTGAGGGTT GTGATAGCAA TCAAACGGGC 2880 

AATCTGTTGG CGTTGGTATT TTTTTTTGTC AGGCTTTGTC AGGTAACCAT TTTTCACATA 2940 

ATTGTTGACC ATAGATCCTG TTAGGCCCTT GTCTTTATTA GGAGAGATAG GGGCGCAGAC 3000 

CTGATTGACA 3010

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15213 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

CATAAATCGG TGCAAATAAC TTAATAGTGA AGTAGCCATT TCTTTCGTAT TTACCTGAGG 60 

CATATTCCCT AGACGAAAGA ATATTATTAT CAATCAAATC ATTGAATGAA CGTAGTCTTT 120 

CAACTTCTTC TACTGTTAGA TTTCTGACAA CATTTGTTGC ATAGACCTTA TTTCCATCAG 180 

GATCAGGATG GTACTCATTT GTAACTTTTC TAAGAAGTTG TTGTTTTTGA TTCGTATCCA 240 

ATTTAAGAAT TGAATTTCCT TCGAGATATT CCAACATATA AACAACGTCA AACATGTTGT 300 

GGACATATTG CTTCAAATCA TCTGCATTAT TAAATCTTGT AGTTGGATCA AGTACTTGTA 360 

ATCGTCGACT TTCTQTACTA TCAGATTTTG AATGTTTCAA GATG6AGTTG ATGGTAATGG 420

TCGCATCATC TGGATGGTCT GGTGCTTGTA ATAATCCTTT AGCAAAGAAC TCTGGTCCCA 480

AGCCACTTCT TCGACCATAT CCTCCAAGAT AAATGTCCTG ATCTGAGTCA TGTGTCATCT 540 

CATGCGTATA AGTAATAGCT CCATCCTTAT CCAACATTCG ATAACCCATA TAATAAACTG 600 

CATCACCTGT AGCATAAGCA CCX3TGTTGAT TATGCCCAAC TTTATTTCCA ACAGGTCCAA 660 

AGAAATGTTG CATTGCAGGA TTTGGATTAT CAAAATCTGC CACTTCTGTA GCTTTCCCTA 720 

CGGTATTATC ATCGCCAAAT TTATAAGCAT CGTAAAGCAA AATATTTCTA TAAAGTTTTT 780 

CACGTGCATT GTCGTCTAAA ATACGATACC AATAATCGTA GTGATCTCGC TGACGTTTGG 840 

CTGTTTCACG CGCATTTTCT TCAACAAAAT CATTGAGAGC CTTGCCCGCT TTATGGTCAC 900 

TACTGCGGTA GCGATCATAA GCTCCAAATC CTAGACTAGA CATGGTCGAG ATGACAAATA 960 

CGGATCTCTC TGGCAAGGTC AGGAGAGGCA AGACXIATATT GCGGTATTTC CATGTGGCAC 1020 

TCGTGATACG ATCATAAACA CCGATAGAAT ACTTGGTGCC AGCTAACCCT TGCTTCGTTT 1080 

TCACCTCTTC GATAGTGGAT TTTTCTTCGA CAATGTAAGC CTTAGTCTCT GATTTAAACC 1140 

AGTCATTATT GCTTGTATTT GGTAAAAA6A CTTTTCGGTA ATGTTCCAGC GTGCTAAACA 1200 

AATCTGTCGT TCCATGTTGA CTGGCAAGAC TGATACCATA AGTATCGACA TTATTCTTA6 1260 

CTAGAAGATT GTTAAAGCCA GATTTACCCA ACTCAATCAG AGTATCTAAT GGTGAAGCAT 1320 

TCCCCTTACC AAAGAAGTCC AAATGGTACA GAACTAGGTC TTTGACATTC ACCTGACCAT 1380 

AGCTAAAGTT ATACCACCGT TCCAGATAGG TCAAGCCAAG TAGCAAGGCT TCCTTGTTGC 1440 

GTTTGATTTT ATCTACAAGA TAACCTTCAG TGACGGGGTT AGCACTAGCC AGTCCAGCAT 1500 

CC6CTGACAA GAGTTTTTTC AAACTGTCTT CCAGTTGTTG TTTTGTTTTG GCGAACTGGT 1560 

CTTCTAGATA GAGCTCAGTT TGCTTGACGT TTGGAGAAAT ACCCAGCGTC TTTCTGATGG 1620 

CTTCTGAATC ATAGTCAACC TTTTGTAAGT CAGGTAAGAC TTGCTTGATG ATAGAGGTTT 1680 

GGTCATACAG GAATTGGTTT GGCGTATAGA GAAGTCCAGT ATTGCCCAGA CTATATTCTG 1740 

CTAATTTGGC GAAATCATTC TGGTATTTGA GATCCAGCTT CTCAGATAAA TCATCCTTGT 1800 

AGTGAAGCAA GAGTTTGTTT GCAGTCTGTT TGTTAGAAAC AATGTCTGTG ATGACTTGGT 1860 

TGTCCTTCAT CATGACTGCT GACAAGAGTT CTTTTTGATA TAAAAGACTG TTCTCATTGA 1920 

CCAGGTTTCC GTATTTGACG ATGGTTGCCT TGTTGTAGAA AGGTAGCAAT TTTTCAATGT 1980 

TTTTATAAGT CAAGTTGCGC TTAGCTTGAT AATAGGCCAC CTTAGAAAAA TCACTGTCTT 2040 

TTTTGCCACT TGTTGAAAGT GGCTCCACTG TTGGTAAAAT GAGAGGATTG ATTTCTGCTT 2100 

TTTTGCTTGC AATTTGAGAA GCATCTAGCA TTGTTCCTCT TTCTTCAAAG GATTCCTTGC 2160 
TGACGACCTC ATCCTTGACC AAGGTGACAT TGTAGACTCT GTTGGCCTTG CTGCTGAATG 2220 

TGTCCTTTAC CTTCATTTCG TTATAGTGGT AACCAGTGAT GGCATTTCCG TTGGTTACAT 2280 

TAACATCGCT GAGAACATTG GTCAAACTTC CAGCATGCCT AACATCACCA GAAGTTCGAT 2340 

CCCACAAATT GCCTGCCACT CCAGCGACTC TACCAAAGTG CTTGACATTG TTGATATCAC 2400 

CTTCAGCATA GCTATCTTGG ATCTGTGCAT CTCGGTCTAC TAGGCCTGCA AGTCCACCCA 2460 

CAGTCTGATC TGAAGTATTT GTGTTAGATG AAATGGCTAC TGTCGCTTTT GACTTAGTAA 2520 

GTAAAGCCTT GTCACCTGTC AAATGACCGA CCATACCACC GATATTGTAG GCAGCAGTCG 2580 

TTTCATAAGT GTTGATAATT CTTCCCTTGA AACTGCTCTC TGTGATGCTT GATTGCTCAG 2640 

CCTTAGCCAG CAAACCACCG ATACCACGTT CACCAGCCAG AACACCATCG ACGTGAACTT 2700 

GCTTAATTTT TGTGTTATTC TGAGCTTCAT TTGCCAGTGA ACCGATATCA TCTTTCCCTG 2760 

AAATAGCAAC ATTTTTTAGA CTCAGTTTTT CTACTGTAGC ACCACTCAAG TTTTCAAACA 2820 

GAGGTTTTTT CAAATTATAG ATAGCATAAT TCTTGCCATC TTTTTCACCG ATTAAACGAC 2880 

CAGTAAAGGT GTCCTTGATA TAGGATCTTT CATCAGGACC AAGCTCCACT TCGTTAGCAT 2940 

TCAGGCTGGC CGCTAAATGA TAGGTTCCAG AGGGATTTTG GTTTATAGCT TTGACCAGAT 3000 

TACTAAAGGA AGTAAAGTTT GTTGTTTCTT CTGTTCCCTT CTTAGCTAGA TAGAAGGTAA 3060 

AATTATCTTT ATATCTGCTT TCTATCTCCT GCTGAAGCTT CTCTACTTTT GCTGTGATTT 3120 

TATAAAGGAT TTTATCATTT TTTCTTTCCT CTGATATTGA TGCTACTGGT AGGTATACAT 3180 

CTTTGAATGA AGAAGATTTC ACTTTAACAA AGTAGCTATT TGGATTGCTT GGAACTTGCT 3240 

CTAACGAAAT GTGTTGTTTA TAAGTACCAT TTGACAAACT GTATAACTCT AGGTCGGAAA 3300 

CATTTCTTAA TTCAAGTGTT TTCTCTGGTT CTTCTACCTT TTTATCAGGG TCTAGTTCAT 3360 

TTTCTTGTTT AATTTCTTCG TTTCCATTTG AATTGGATGT GTTTGATTCG GTTGAAACAT 3420 

CCTCAGTTGA ATTTCCGTTT GATGGTTCTG GTTCTGTTTG TCXATTCTCT GATGTTGTAT 3480 

TACCTGAATT TTCTGGTTTT GTTGCAGTTC CGTTTTTTTC TGGTTGATTT GATTCTTCAA 3540 

CTGGTGGTTT TGAATCACTA GGTTTATTGG ATACTTCTCC AGTATTTTCG TTAGCTATTT 3600 

TCCCAGAGTT TGTTTGTGTT TCTTCTGCAG GTTGAACTGG TTTTTCTGTT TCTTGATTTG 3660 

AGGTACCTTC TACTGTGCCT TCATTTGGAT TTACTGGAAC TTCTTCTACA GTTTTTTCTG 3720 

AATTTTCATT TTTAGAGTCA TTATGTTCTG GTTTATTTGA TTCTCCAACT GAGGTTGTCG 3780 

AATCACTAGG ATTACTGGAC ACTTCCCCAG TATTTTTGCT AGATGTATCT GGTGATACTT 3840 

TCTCTGAATT CGTTGTTGAT TCTTCTGCAG GTTGAACTGG ATTTTCTGCT TCTTGAATTG 3900 

AGGTTCCTTC TGTAGTACCT TCATTTGGAT TTACTGGTGT TTCTTCTGTT GGTTTTACTG 3960 
GAACTTCTTC AGTTTTTTCT GGACCTTGTT CTTTGGTCTT CTCAACCGGA GTTTCAGGTT 4020 

TTACTTGCTC AATATTACCC TTATATTCTG GAAGCGGTGC TACCTGCTCT GGTTCACCTT 4080 

TATCACTTAC CACAGTATCT GGCGACTCTG GTTGAACCTC AGTCTCACCT TTGTCGGTCA 4140 

CAACTGCTTC GGGTAATGTA GGTTGAACTT CTGGTTCGCC TTTGTCACTT ACTACAGCTT 4200 

CGGGCAACTC AGGCTGAATT GCGGGTTCAA CAATAGCTCC AGACTGTACG TCCTTATGTT 4260 

CTACACCAGT CTCAGGTTGT TCCTTTATAA CTTGAGTTTT TTTAGTACCT TTTTCGACTA 4320 

TTCTTGGACT AGGCGCAGTC GTTGAAGTTG AAACAATTTC TCGCGAAACT TCTTCCTTGT 4380 

TTACAGAGAA TATTCTGACG ATTTCAACTT TCTTACCTAA TTTACCTTCT TGTTTTACTC 4440 

TTACAGTTCC TTCAGCTAAA TCAGGATTTT CTTGAATTTC TTCTTGAAAA TCTATTTTTG 4500 

TCTCCATAGT TTCCTCACGA TATAAGAGTT CAGGTTTGTT CAATTGACCT GATAAAACTT 4560 

CATCCTGTGG ATTTAATGTA TTTACCCCAG TCTTTTCTTT TGGAGAAATC TTCTCCTCTT 4620 

TCTTCGTTTC TAGATTCTTA TGTTCGGCTA ATTGTTCTTG AGAATCTGAA GATTGTTTCT 4680 

CTTCTTTTCT TGGATTGATT AATTCAGTAG AGAAAGGTTT TTCAACTACT TGAACTTCTG 4740 

TCGGCTTAGT TGAAGAAACA GGTGTTTGTT CCTGAATAGC TTGTACTGTT GATGGATGGT 4800 

CTACAAAATT CGGTGTAACA TTATAATCCA CCTTTTGTTG TTTTGTAGGA GTGGCAACTG 4860 

AACTCTTTTG ATTACTTACT TCAGACTCAG AAGTCGTTTT TCCCTCTTTG ATATATCCAA 4920 

TATAAGTGTA ACCTGAAATC TCTTTAGGAA GAGGTAATTT TTCTCCAGAG GTCAATTCAT 4980 

AGTCCGTATT GTAATTTAGC AAAAGATGAT TTTCTAAAGC ATGGACTGAA ACTAAGACAC 5040 

CATTTCCTAT CCCTGCAACC AATACTAAAT GTAATACCGT TTTATTCTTA ACCTTTTTCT 5100 

TGGAAACAGC AAAAATTAAA ATTCCCATAG CAGCTAAGCT AGCACCAGCA ACTAGGGCTT 5160 

GCCTCTCATT CTTGCTTCCA GTATTTGGCA ATTCCGCCAG TTGATTTTGA GAATTTAACT 5220 

TATAAACAAG ATAATAAGTT TCATCATCAT TCTCCACGTA TGTCGGAATA TCATAGACAA 5280 

GCTGCTTCTT TTCTTCTGAT GATAGCTCTG AATCTGCCAC ATATTTATAG TGAACTCCCG 5340 

CAGTTTCTTG AGCATCCACA GATGAACTAG CTAATACAGA CATAAAAAAT AAACTTGAAA 5400 

TCGTTGCAGA TACAAGTCCT ACTGATAATT TTCTAAATGA AAAACGCTCT TGTTTTTCAC 5460 

CAAAATACTT TTCCATTATT CCTCCTTGAA ATAAAATTTA TATATGTTAC AAAGACCTTT 5520 

ATTATATTAG TGTATTATCT ATTATCTATA GAAAAGGCAG TATACCTTAA TTATACTCTT 5580 

AATTTACAAA AAAGTCTTAA AATTGAGATG CGCTTTCATA CTTTGTTTTA TATTATTTGG 5640 

AGGTACAATA ACACCTACCA TGAAATTTAC ACGGTAGGTG TTACTCATAT CACTAATCGT 5700 

TCTAAAAATG GTTTGAGGCA GTTGAGGAGA ATTCCTTCTA TCCAGCTTCC TTGTGCTGAT 5760 

GAGCGATGGT CTTCCTGCAG GCTTTTTTTT AGAAAATCTC GGACTTGTTC TGGTGCGATT 5820 

TCAAATTCAA AGGCTTTCAT TTTATAGAAA AAGTCGATGA GATGATCTGA CAGGTATTCA 5880 

GTTGAAAAGG GTACTTCACC ACTTTTTCTA TATTCTAATA AGAGTCTAGA AAATCGAGCT 5940 

TTTTCTTCAG GAAGCTCACG AAAATAGGAA TTGAGGATCC AAGTCTGCTT CTGTTTTCTT 6000 

TCAATTGGAT CCTGACTGGC AATTCGTTGG TCTTTTTCCA GCTCTTTTTG GTATTGTTTG 6060 

GCCTTGATAG CTCGTTCTGC TCTATTTTTA CCAAAAAGAA TTTTTTCCCA CTTGCGTTCT 6120 

TCTTGAGTCA GGGTCTCTGT AAAGCCAAAG TAATCTTGAT AAGCACGCTC TGCGGGTCCC 6180 

ATGGCTAGAA CCAGATTGTC TGCATATTGC TTGGCGATTT TATCCCTCTT CTTGCGTTCT 6240 

TTCTCTGCCT GGATACGGAG TTCTTGTTCG TAGTCAATTT TCTCCTTGCC TAGCTTGACA 6300 

AGGTAGAGTT GGTCATCCGA TTTCCCAAGT AAAAAGGGTT TGATACACTT TTCAAGGACT 6360 

TCTTCXaVTCC GAGCCTTTTT CTTTGGTTCC GCCTTGGTCC AACTTCCTCC CTGAAAGACT 6420 

TCTAGGAAAA GCTGGTAGTC TCTCTCAGGC GCAAATTGAT TGCCACGATT GGGTTTGAAA 6480 

ACACCTTTTT CCCAGAGCCA TTTTAGAAGT CGCTCGTCAA AGTTACTTTT ATTGACCTTG 6540 

ATTTTTTCCT TTTTCTGAGC TTTTCTGGTT AGATTTTCAA CCTTTCTGAG CAGTTTTTCT 6600 

TCCTCTTCCA ATTGCTGGTC AAGGGACAAT CGATGAAAAT GACGAACACA GTCGCTACCA 6660 

ATTGGAAAGA GGCGTTGGCC TGTGACACCG TTAAAGAGTT CATAAGCGTA TTTGATGGCA 6720 

TTTCCACAGA CACAATTGCT ACGGCCGATA CCGTTAAAAA TAAAGGAAAC TTCATTCCAT 6780 

TCCTTGGTAG CTTGTTCCCA AGTATCCGCT TTCGAAGCCT GTAAAACTGC ATCGTGCAGG 6840 

GATTTTCTAA CTGGAAGTGT CATGAGGTCT CCTTTCTAAT ACTCAATAAA AATCAAAGAG 6900 

CAAACTAGAA AGCTAGCCGC AATCAGCTCA AAACACTGTT TTGAGGTTGT AGATAGAACT 6960 

GACGAAGTCA GCTCAAAACA CTGTTTTGAG GTTGTGGATA GAACTGACGA AGTCAGTAAC 7020 

CATATATACA GCAAGGCGAA GCTGACGTGG TTTGAAGAGA TTTTCAAAGA GTATAAGTTA 7080 

TACTTTTACA ACTTGAACCT CGTCTTTACC GAGTAAAATC AAGTATTTTT CAATATTTTC 7140 

AATCGAATAG GCTCGTGATA AAGCCTCTTC GTATAGAGCT AACTGACCAC GATAGCGGTC 7200 

TACGAGTTGA CTTGGTTCAT GATAGCGGTC TGTCTTGTAG TCGAACAGAA CAATTTTGTT 7260 

TTCGTAAAGC AGATAGCCAT CAAGGATACC ACGGACAACA AAGTCTTCCT GACTCTTTTG 7320 

GTCTCGTTTO AGCATGGAGA AAG6TT0CTC 6CGATAAAQA TGGTCGGTAT TAGCAAGAAT 7380 

TTCCTGACCG AGTACTGTGT CAAAGAAAGC AAGAATTTTA TCAAGATTGA TCTTGTCTCT 7440 

GACAGCTTGG CTAGTTTGAA CTTGTTTGAG TGTTTCTGTT AGGCTAGCAA GGGTTAGTTG 7500 
CTGGCTGAGG TCAATTCTCT GCATGAGTTC GTGAGTAGCA CTACCAATCT CAGCTCCAGT 7560 

TACCTTTTCT TTGGTTGAAA AATCTGGCAA ATCGAAGCTG ATTTTCTTGC CTACTGACTG 7620 

ACCTTGACCA GCAATCTCGA CACCTTCCAT ATCCATAACT GGTTCGTAGA ATTTCTTGAT 7680 

TTGACTTGGG GTTTGAACAC TAGGAAGTTC AATAGCTGCG CGGTGAAGAG TATTATAAAC 7740 

TTCCACCTCC TTCAGCATTT CCAGAGCTTC TTTGATGGTA TCTGACTGAC GATTGTCTGC 7800 

TTGGGAGCTA TCTTGGAGAG GACTCTTGGT TTCCAACTCT CCGATAGCTT CTCTGGTCAA 7860 

CTGATCTTCG CCAATAAAAC GATAACTAAA GTTGAGCTTG TCCTTAGTAA ACACTTTACT 7920 

GATAGCCCAA AGCCAATCTT GGAAATTCCG TGCTTGCAGT CTAGTATTGC TATTTAGTTT 7980 

CCCATTTTTG GCTGCTGGGT ATTCCTTGGA TTCCAGCTTT TCACGAGAAC CCTTGCCGAC 8040 

AAGATAGAGC TTTTTCTCAG CCCGCGTCAT AGCAACATAC AGCAAACGCA TCTGCTCAGA 8100 

ATAGCTTGCT AGCTGTAATT CCTCTTCGTT CTGCCTATAG GTCAGACTAG GAATGGAGAG 8160 

TTTGATGGTT TTAGGATAGT GGTCTTCTAC TGCCCCTGTC TCCATCTTGG CAATATATTT 8220 

GACACCAAGA CCATTCTGAC GACTGAGAAT GACTTCTGAC ATAGAGTCTT GCTTGTTGAA 8280 

ATCTTGATCC ATATTGAGGA TAAAGACGTA AGGAAACTCC AGCCCTTTAC TCTTGTGGAT 8340 

GGTCATGAGC TCTACTGCAT CTTTTGGCGG TGCGACGGCC ACGCTTGCCA AATCGTGCTG 8400 

GGCTTCTAAG ACTTGGTCAA TCATACGAAT AAAACGCGAC AAACCTTTGA AATTGCTCTT 8460 

TTCAAATTGA TCAGCACGCA GTGCTAGGGC ATAGAGATTG GCCTGCCTAG CAGGACCATT 8520 

CGGCAAAGCC CCAACATAGT CATAATAAAA ACGGTCGTTG TAAATCTTCC AAATCAAGTC 8580 

ATAGAGAGAG TGGGTTTTGG CATACAAGCG CCAAGAAGCT AGGATATCCA TGAATTGCTT 8640 

TAGTTTTTCA GCTAGAGCTG TGTGAATCAA GCCTTTTTGA CTACTTGCCA TTTTTTGTGC 8700 

ATTGACCAGT TTCTCATAGA GATTTTCGTG r-«:.TT^ATCr TCTGCTTTCT GAAGGGACAA 8760 

ACGTGCTAGC TCATCCTCAT CAAAACCAAA CATTGGAGAC TTCATAAGGG CAACCAAGGC 8820 

GTAGTCTTGC AGGGGATTGT GAATGACACG AAGAGTGTCT AGCATGACTT GCACTTCTAG 8880 

GGATTGGAGA TAATTGTTTT GCTCTCCGTC AGTTTTGACA GGAATTCCGT ACTCAGACAG 8940 

GGCGAGGAGA ATCTGGTCAT TACGACTGCG GCTGGAGGTC AGAAGGGCAA TTTCCTTAAA 9000 

GGCAACACCT TTTTCTTGAT GAAGTTTCAG AATCTCCTTG ATAACTAAGC GCATTTCGCC 9060 

TGTTAGTTTC GTTTCTGTTT GACTCTCTTC TTCCTCACCT GTATCGTCCT TGTCGTAGAG 9120 

GAGAAATGCT GCCTTGTTGT CTGGATTGGG AGTCAGTTTG GTATTGGCAA AAACAAGCTG 9180 

GTGCTTGTTA TCATAGTTGA TTTCGCCGAC CTCTTGGTCC ATGAGACGTT CAAAGACATC 9240 
ATTGGTTGCT GACAGCACTT CTGAACTACT ACGGAAATTT TCCTTGAGGA TAATGAGCCT 9300 

GCCTTCTTGG GGATTTTGCG CATAGCGTTG GAATTTCTCA TTGAAAATCT GCGGGTCTGC 9360 

CTGACGGAAA CGATAGATGG ATTGCTTGAT ATCTCCCACC ATAAAGCGAT TGTGGCCATT 9420 

AGACAACAAT TCCAGCATCC GTTCTTGAAT ATGGTTGGTA TCCTGATACT CATCGACCAT 9480 

GACTTCATGG AAGCGCTCCT GATAAGACTC ACGAACTTGT GGGAAATTCT CTAAAATCTC 9540 

AATGGTGTAA TGGCTGATAT CAGCGAATTC GAAGGCATTT TCCTGTCGTT TTCTCTGACG 9600 

ATAAGCCTCT ACAAAATCGC TCATGAAAGA TTGGAAGGTT TTAGCTAGTT TCCAAGTGTC 9660 

TCCATGATAA CGTTCTTGAT AGTCGAGAAT CGCTATCTGG TCTGATAATT GTCCTAGTTT 9720 

AGCAAACTGG GTCTTTCTCT CTTCGTTGTA GGCATCAGCC AGGGGCTTCA AATCAGCCTA 9780 

CGGCTGGCAT TAGTCAGAGC TCGACCGTTT TTCTCCTTAG AGATGGOSAC AACACGCGCA 9840 

AGCACTGCCT GATAAGCCTG ACTATCGGAC TCCTGATTTA GGGAGCCAAT TTCATCCAGA 9900 

ATTAACTGAA CATTTTCTAA ATAGGCAGCC TTTGCAAACT CCTTGGCATC GTTATCCAGA 9960 

TGGTAACGGA AAAAGCTTTC CAAATCCCAA AGGGCTTGTT TGATTTGCTC GGTCAGTTTT 10020 

TCTTTTTCAC TGGTAAAATC AGCTTTCTCA AATCCTTTGA GGAAAGATTC ACTCAGCCAC 10080 

TTTTGAGGAT TACTGGTGGA TTGGAGGAAG TCATAGATTT TATAGACCTG CTGGCGCAGA 10140 

CCCCGTTCGT CCTTGCCACG CCCAGCAAAG TTTTTCAGCA AATGACTAAA GGTCTCTTTC 10200 

TGTTTACCTT GGTAATGCGC TTCAAAGACC TCATGAAAGA CTTCGTTTTC GAGAATAAGT 10260 

TGCTCGCTTT GGTTTTGTAA AATACGGAAA TTAGGTGCAA TATCAAGCAG ATAACCATGT 10320 

TTGCCAAGGA ATTTTTGTGT GAAAGAATCC ATGGTTCCAA TGGCAGCGTT GGGTAGGTCT 10380 

GCCAACTGGC GACXTCAAGTG TTGTTTGAGG TCGACATCAT CTGTTTCTTG GATTTTCTTG 10440 

CTGATTTTTT TCTCTAAACG TTCTTTAAGT TCAGTTGCAG CCTTGACGGT AAAGGTTGAG 10500 

ATAAAGAGTT GAGAAATTTC GACACCACGC GCCAATTGGT CCAGAATGCG CTCTGCCATG 10560 

ACAAAGGTCT TTCCAGAACC AGCCGATGCT GAGACCAGGA TATTCTGGGC AGAAGTGTAG 10620 

ATAGCTTCGA TTTGCTCGGC AGTTTTCTTC TGTTCCTTGC TCGAATTTGC TTCTGCTTCT 10680 

TGCAGTTTTT GAATCTCCTC CTCACTTAAA AAGGGAATAA GCTTCATCGA TTCAACTCCT 10740 

CTCTTATTTT TTCAAGCCAA GCTTGCTTGA GTTTTTCTCC GACCAGACGC TTGCCATCAG 10800 

CTAGGTCCAA CTTTTCTAGG AAACGGGCTT GGCCCAGATG GTAATTGGCT TCAAAGCCTG 10860 

TAATAGCCTG ATGTTGCTGG ACGTATGGGG CAATGCTTCT GCCATTTTCA GTATAAGGAT 10920 

TGATGGCGAA CCGGCCTGCT AAAATCTTCT CAGCAGCTTT CTTGTAAAGA TAGGCATTGT 10980 

AGTCCAGTAG GAGCTGAAAT TCTTCATCTG TCAGTTGATT AGCCTTGTTT TTGTTATAAA 11040 
ATTCGCCTAA ATAACTGCTT TCTTTTTCCA AGAAGAGCCC TTGGTATTTC ATAGATTTGC 11100 

TGGCTTCTAC CACTGCTCCT GCCAGACTTT TTACCGCCAT CAGAGATTGG ACAGGTTCAG 11160 

CCATTTCCAA GTACATGGCG CCGAAAAAGT TCTGCTCCCC TTCTCTTTTT AGGGCAGCAA 11220 

GATAGGTTGG TAACTGAGAA TTGAGCCCAT TAAAGAAATG AGGAAACTGG AACTGAGTCA 11280 

GACTGGATTT GTAGTCTACT ACTCCTATCG CTCCATTAGC TTTCAAACGG TCAATCCGGT 11340 

CCACCTTGCC TCGTACAAAG ACACTGCGTC CATTGTCTAA TTGAATAAAG GCTTGGTCTT 11400 

TTCCACCAAA ATTTGCTTCT TCTTTGATGG TTTCGATGGC TGGATTGTGT CGGAGAATAT 11460 

GTCCAGTTGT CCGTGCAACA TCAAGCAAAA CTTCCTTGGT AAACTGGGCT TCCAAACTTT 11520 

CTTGATAAAT AGCTTCAAAT TCGCGTTCTT GACTGGTTTC TTGAATAGCT TGTTCTAGAC 11580 

GTTGGTCAAA GGAATCTTCA TTAGGCAACT GTAAGGCGCG TTCAAAGATA CGATGCAAGA 11640 

AATTCCCGTG ACTACGGGCA TCAGGATGCA AACGTAATTC CTCCTGCAAG CCTAAAACGT 11700 

AGCGTAGGAA ATAACTGTAT TCATTGCGAT AAAACTCTGT CAAACCCGAC GTAGACAGGT 11760 

AAAACTCCTG TTTGGCAGGA TAGAGAGCTT GCAAGGTGTC CTTGGCTAAG GTCTTGCTGC 11820 

TTGGACTGGT TGGGATAGCT GGATTTTCCA GACCTTGCTG ATCTAGTTTT TTACCTATGA 11880 

CACGCGACAG AACCTTGACA AAAGTCAAAT CTTGCTCAGT ATCGCTCATC TCACCCTGCT 11940 

GGTGATAGGC AACCAGACTA GACAAAAGAC TGTGATAGGA CCCCATATCC TCCTTAGACA 12000 

GTCCTTTGTG ATTCATCCTC TTCTCTCTCC GCCTAAATCC AAAATGGATC AACTCTTGAA 12060 

GATAGGCAGA TTCCTTACTT TCACTTTCGT TAAAAAGGCT TGGAGCCGAC AAGAACAACT 12120 
GCTTACGAGC AGAATTGACC AAGGAAAGCA TAGTGTAGCG ATTTTTCTTG AGATTTTCAC 12180 

TGCTGGCAAT CAGTAATTGA ACGCCTTCTT CGGTCGCTTG GTTTAGGTTT TGCCTTTCTT 12240 

CATCTGTCAG AAGACTGGTG TTTTGAGAAA TTTTTGGTAA ATTGTCCTGA GTTAGTCCAA 12300 

TAGCATAGAC AAAGTCAGCA GTCAATGGTG CAATCAAATC GTAACTCTGC ACCAGAACAG 12360 

TGTCCACTGT TGCTGGAATG GTACGGTATT GGGACAAACT CATTCCAGAA TGGAGCAAGG 12420 

CTAGGAAGTC TTCCAGACTA ACCTGTGAAC CAGCAAAAAC AGTCGCAAAT TGTTCTAAAA 12480 

CATGGCAGAA AGCCTTCCAA ACTTCGGCTT GTCTTTCCTG TTCTACAGCT TCCAAAGTGG 12540 

TTGTCAAATC TTGTAACTGC TTGGTCACAG CTCCTTCTTT TAGAAAGACA CTCCATTTTT 12600 

GTAGGAGTTT TTCAGCCTTT TGTTTTCGGC TGGCAAAGAG GGTTTCAAGA GGTGCTAAAA 12660 

TTCTCAGGCG GAGGACATTC AAACGCTCAA GATTAAATTT TCCATGGTGG GATTTGGTGA 12720 

AGGTTTGCTG AAAGGCTGGC AAGCCATTGA TACCAAGATA GCGGATATAT TGCTCAAAAG 12780 
CATCAATATC AGACTGACTG AGGTCAGTAT ACAAATCAGT TCTAAGAAGA TTAATCAAAT 12840 

CCTCCTGACG AAAACGGTAA CGTTTTAAAG CTAAAATAGA CTCGACAAAC TGAGTCAAGG 12900 

GATGATGAGC CATGGCTTCG CTTCTACCAA GATAAAAAGG AATCTGATAC TGGTCAAAAA 12960 

TGGTTTTGAG AGATAACTGG TAAGAAGCTA CATCCCCCAA GAGAATACGA AAATGCTTGT 13020 

AGCTCAGGTC TGAGTTCTCA TGTAATTTCT GACGAATACT ACGGGCTACT AGCTCCAACT 13080 

CCTCCTTTTG CGTCAAACAA GACCAGATTT GTAAATTTTC ACGGTCTTTC TCATCGACAT 13140 

CCAAAGCGAG TTCTGAAAAG TCATAAGAAG ACTCCAACAA ACGAGAGGCC TTGTCAAAAC 13200 

TATCCATCTT CTCATGAGTT TGAGAACAGT CCTGAGCAGG CGTTTGGTAT TTAGAAGCCA 13260 

GATGATGGAG AAATTTTACG CTGGCTTGGT AGAGATTGCC CTCGCTAAAA GGACTGGTAT 13320 

AGGCTTTCTT ACTAGCATAA GCCCCGATAA CAATCTCAAC ACCTTTGCCG TGAAGTAAGT 13380 

CCACAACCCG CTCTTCCTCA GCAGAAAAAC GAGTAAAGCC GTCAATGACC AAGGCGATTT 13440 

GATTAAAATC ACTACTTACC TTGTCATTCT CAATAGCCTC AATCAAATGG GACAACTGAC 13500 

TTTCCTGGGC TAACTGACCT TGATTAAGAT AGGCTGTTAC TTTCTCAAAA ATCAAGAGTA 13560 

AATCCGCCCT CTTATCCTCA TCTGTTAAAT TCTCCAAGTC CAAAAAACTC ATCTGAGATT 13620 

TGGTCATCTC ATGGTAAAGC TCAATTAACT GCTGGATCAA TTGAGGATCC TGCTTAATAG 13680 

CGCCATAAAC ACGCAAGTCC TTGGGATCGA GTTCGGCAAG GCATTTGTAA AAGGCCAACC 13740 

CAAGACCGAT ATCATCAAGA GTAGTTTTAG CTGGTAAATC ATTCAAGACC AGATAGCGAG 13800 

CCATTTGAGC AAAGCX3CGTG ACGGTAATCG AAAAAGAAGC CTGCTGGGAC AAGTATTCCA 13860 

GCACGGCGCG TTCCTTTTCA AAAGAAAGAG AGTTGGGGGC AATGTAGAAG ACCCGCTTGC 13920 

CAGCTGCAAC TAGCTCTTCT GCCTCTCTTG TTAGAATTTC TGTCAAAGAA GTCCGAATAT 13980 

CAGTATAAAG TAATTTCATC TCAGCCTCGT TGGAATTTTT CATCACCCTA TATTATACCA 14040 

TGATTAGCCT CGTAAATCTG TTAAAATATT TAGGCCATCC TTTCTTTTCT TCATCATCTG 14100 

CTAAATCTTA AATACTTAGC TTTACTTGTA TTAGATAGAA TAAGTCTGGC TACTGAAAiVT 14160 

CACATAATAA AAAAGCCTCG GTAACAAGGC TTTGAGTTTT ATGATTGTTT CTTAGGTACG 14220 

GAATACACTT CAATGTGTTG TCCCAGTATC TTAATGTCGA CTGGTAGATT GTCTGATTTA 14280 

TCGCCATCAA CATCGGACTC TAATTCGATA TCAGAAGAAG TTTTAATATT ACGTGCCTTT 14340 

ATATATTCAA TATTCTTGAT AGAATGATTG AACTATAGTA AATTGAAACT ATAATAGTAC 14400 

ACCGTGGATG CTAAAATATT TCTAGAAATT AATTTGATTT CCCTAATCAA GCTATTCGTA 14460 

TCTTATTTCA ATCTACTATA ATAAAATGAA CCAAAAATAG TACACAATGT GGTATAATCT 14520 

TCTTATGGCA TATTCAATAG ATTTTCGTAA AAAAGTTCTC TCTTATTGTG AGCGAACAGG 14580 
TAGTATAACA GAAGCATCAC ACGTTTTCCA AATCTCACGT AATACCATTT ATGGCTGGTT 14640 

AAAGCTAAAA GAGAAAACAG GAGAGCTAAA CCACCAAGTA AAAGGAACAA AACCAAGAAA 14700 

AGTTGATAGA GATAGACTTA AAAACTATCT TACTGACAAT CCAGATGCTT ATTTGACTGA 14760 

AATAGCTTCT GACTTTGGCT GTCATCCAAC TACCATCCAC TATGCGCTCA AAGCTATGGG 14820 

CTACACTCGA AAAAAAGAAC CACACCTACT ATGAACAAGA CCCAGAAAAA GTAGCCTTAT 14880 

TTCTTAAGAA TTTTAATAGT TTAAAGCACC TAGCACCTGT TTAGATTGAC GAAACAGGAT 14940 

TCGATACTTA TTTTTATCGA GAATATGGTC GCTCATTAAA AGGTCAGTTA ATAAGAGGCA 15000 

AAGTATCTGG AAGAAGATAT CAGAGGATTT CTTTGGTTGC AGGTCTAACA AATGGTGAAT 15060 

TAATCGCTCC AATGACTTAC GAAGAGACGA TGACGAGCGA CTTTTTTGAA GCTTGGTTTC 15120 

AGAAGTTTCT CTTACCAACA TTAACCACAC CATCGGTTAT TATAGTAAAA TGAAATAAGA 15180 

ATAGGGGGGG GGGGGGAGGG GGGGGGAGGG AGA 15213 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6004 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 



TTATTACCTG AAACATTAAA TTTAATTGGA CATCCCGTTA TCAATTTTAT AATATCATCA 60 

AGATTTTTAT TATCTGATTC AGGAATTTTA TCTGATATAA CAACACCATT TTCAAGATAG 120 

TTCATTAAAT TATTTGATTC ACTAACATTA GTGTTTTGAT CTCCATCAAG CCAAAAATAA 180 

TGGTTATCGG AATCTAAATA CGATGAGTTT AAAATATTAT TACAAATTAT TTGATTTGCT 240 

CCACCAGGAA TATATCTCAC TACTAAATTC TGTTTAAGAT TCTCACTACC TGAATGAGTG 300 

ATAACAAACT CTAGAATATA TTTAGCTAGT CTATCTTCAA CATAAATCAT CTTCCTAGAA 360 

TGATACACAT CACCTAATTC AAAAAATGCA TOCTGATAAT CAATATTTTC AATAACATCT 420 

ACCTTTTCTC CGTTTTTCAC TAAAAGTTTC ACGGCTTCTC TAGGAAAATC TTTTATAAGT 480 

TGTGTAGAAT GTGTAGTGAT AATAATTTGA TGTPTTTTTAT TTAAACACTC TTGAAGTAAA 540 

AACTCTTTAA ATTTATAGAT TGCACTCGGA TGAAGTGAGA TTTCAGGTTC ATCTATTAAT 600 

ATTAATGAAT TTGATTGCGC ATTTACTATA TCATTTACTA ACAAAATAAT TCTAGCCTCA 660 

CCTGTTCCTG CAAAAGCCTC GGAATATTCT TTTCCAGATT TTTTCATCCA AATAGTTTTG 720 

GAAGCTTTTA TATCATCACC TTTTGAATAC AACTTATGTG TTAAAATTTG AATGTCTGTA 780 

TAAGATTCAT CCATTATTTC ACTAATAATT TCACAAACTT TATCATCAAC TTTAACATTA 840 

TCTATAACCA TTTCCTTTTT ATAACGCGTA TAGCTACTOXS TATTATTCTT TAAAATATCA 900 

GCAACTGGCT TAGATCGTAA TCTTATAAAA TCTTGTTTAC TACGTTGAGT AGAAATTTTT 960 

TTAAAATTAT AGTGATAGAA AAATAAATCA AAAGCAGAAA CATATTCTTT ACAATCACAA 1020 

AAGACAACAT TTTTTTCAAT GCCATCCCAT CTGTCTGTCG AAGAACTTCC AATATATTTA 1080 

TTTTTGGGTA ATCTTTCCAT CTCATATTGT TTTTGAGGAG CATATGGTTC CCAATAATCT 1140 

AATCCTTTTT TTGTTCCAGA ACGGCCTTTA AGAACTTCTA CATTTCTAGA AGCTTTAATG 1200 

TTATAATATG AATAGATTAA ACATTGTTTC CCATCCACTT CATCTATTTG ATCAACATTT 1260 

GTACTAAACC AATATTCAGA CACACTTTTA TTGGCTGGAG AACCATATAA AGCTTGTAAA 1320 

ATTGAAGTTT TATTTACTCC ATATCTATTA CAGACACCTC AGGATTATTT AACTTATAAG 1380 

TTTTAACAGC TACGGAATCA ATTTCAACAG CAACTTGAAC ATCTATGCCT GATTTTTTAA 1440 

GGCCACTTGT AGTGCCACCT GCACCGTTAA ATAAATCAAT AGCAACAATT TTCCCCATAG 1500 

TATTCTCCTA AAGTTTCTCC TTTTTATTAT AACATTATCA AATGTAAAAC CCAACCCGAT 1560 

AGGGTTAGGT TTTTAACATC ATTTCACCAA CTTCTTCATC TCATCAATAC GTGCGACGGT 1620 

CGCGTCATAT TTAGCTTGGT AGTCAGCTTG TTTGTCGCAT TCTTTTTGGA CGACTTCTGG 1680 

TTTGGCGTTG GCTACGAAGC GTTCGTTAGA GAGTTTCTTA CCAACCATGT CCAGTTCTTT 1740 

TTGCCATTTA GCAAGTTCCT TGTCGAGACG GGCCAGTTCT TCTTCAACAT TGAGGAGATC 1800 

GGCCAGTGGC AGGTAGATTT CTGCTCCTGT GATGACACTT GACATAGCCA GTTCAGGTGC 1860 

AGGGATGGTT GATGCGATTT CCAAGTGTTC TGGATTTGTA AAGCGTTTGA TATAGTTGAC 1920 

ATTGCTGTTA AAGAAGGCTT CCAAGTCGCT ATCGCTTGTC TTAACAAGGA TGGTGATAGG 1980 

CTTGCTTGGT GCTACATTTA CTTCCGCACG CGCATTCCGA ACAGCACGAA TCAAGTCTTT 2040 

GAGACTTTCC ACACCAGTGT GAGCCGCAAG GTCTTCAAAG GCTAGATTAA CAGTTGGGTA 2100 

TGCAGCTGTC ACGATAGAAC CTTCTGAGAT TTGTCCAAAG ATTTCCTCTG TCACGAATGG 2160 

CATGATTGGG TGAAGGAGAC GAAGGATCTT GTCCAGCGTA TAGAGGAGAA CAGATCGAGT 2220 

AATGACCTTA TCGTCTTCAT TGTCGCTGTA TAGAACTTCC TTGGTCAACT CAACATACCA 2280 

GTTGGCAAAT TCTTCCCAGA TGAAGTTGTA AAGGATATGA CCAGCCACAC CAAACTCXyVA 2340 

CTTATCAAAG TTTTCAGTAA CTTTTGCAAT GGTTTCGTTG AGATTGTGGA GAATCCAGCG 2400 

GTCCGTCACA TTACCAGCCT CACCTGTTGC AACTTTTGTG ACATTGTCAT GCGCCACATC 2460 

CAGCGTCAAA CCTTCATTGT TCATGAGGAT ATAGCGAGAA ATGTTCCAAA TTTTGTTAAT 2520 
AAAGTTCCAT GAAGCATCCA TTTTCTCGTA AGAGAAACGA ACGTCTTGAC CTGGTGCGGA 2580 

ACCGTTTGAA A06AACCAAC GAAGGGCATC AGCACCGTAT TTCTCGATGA CATCCATTGG 2640 

GTCAATCCCG TTACCGAGAG ATTTAGACAT CTTGCGTCCT TGCTCGTCAC GGATGAGACC 2700 

GTGGATAAGC ACGTTTTGGA ATGGCTGACG ACCAGTAAAT TCCAAGGACT GGAAGATCAT 2760 

ACGAGACACC CAGAAGAAGA TGATGTCGTA ACCTGTTACC AAGGTTGAAG TTGGGAAATA 2820 

ACGTTTAAAG TCTTCTGAGT CGACTTCAGG CCAGCCCATG GTTGAAAATG GCCAGAGGGC 2880 

AGAACTGAAC CAAGTATCCA AGACGTCTTC GTCCTGAGTC CATCCGTCAC CTTCTGGAGC 2940 

TTCTTCGCCG ACATACATTT CACCATCAGC ATTGTACCAG GCAGGGATTT GGTGACCCCA 3000 

CCAAAGCTGA CGAGAGATAA CCCAGTCGTG GACATTTTCC ATCCATTGAA GGAAGGTATC 3060 

GTTGAAACGA GGTGGGTAGA ATTCGACCTT GTCCTCTGTG TCTTGGTTAG CAATGGCGTT 3120 

CTTAGCCAAT TGGTCCATCT TGACGAACCA TTGAGTAGAC TWVGCGTGGCT CAACTACGAC 3180 

ACCTGTACGT TCTGAGTGAC CAACACTGTG GACACGTTTT TCGATTTTGA CAAGGGCACC 3240 

GATTTCTTCC AACTTAGCAA CGACTGCCTT ACGAGCTTCA AAACGATCCA TGCCTGAAAA 3300 

TTCAAAGGCA AGCTCATTCA TAGTTCCGTC GTCGTTCATG ACGTTGACTT GTGGCAAGTT 3360 

ATGACGTTGG CCAACCAAGA AGTCATTTGG ATCGTGGGCA GGTGTGATTT TCACGACACC 3420 

AGTACCAAGC TCAGGATCTG CGTGCTCATC TCCAACGATT GGGATGAGTT TATTAGCGAT 3480 

TGGAAGGATG ACGTTTTTAC CAATCAAGTC CTTGTAGCGC GGGTCTTCTG GATTAACCGC 3540 

AACCGCAACG TCCCCAAACA TAGTCTCAGG ACGAGTTGTA GCAACTTCAA GGGCGCGTGA 3600 

ACCATCTTCC AGCATGTAAT TCATGTGGTA GAAGGCACCT TCTACATCCT TGTGAATCAC 3660 

CTCAATATCA GAAAGGGCTG TGCGAGCTGC TGGGTCCCAG TTGATGATAA ACTCACCACG 3720 

ATAGATCCAG CCTTTCTTGT AAAGGTTCAC AAAGACCTTA CGAACAGCTT TTGACAAACC 3780 

TTCATCAAGA GTGAAACGCT CACGAGAATA GTCTACAGAA AGCCCCATCT TGCCCCATTG 3840 

TTCCTTGATG GTAGTGGCAT ATTCGTCTTT CCATTCCCAG ACCTTCGTCA AGAAAGACTC 3900 

ACGACCTAGG TCATAACGCG TAATACCCTC ACCACGTAAG CGCTCCTCAA CCTTAGCCTG 3960 

AGTCGCAATA CCAGCGTGGT CCATACCTGG AAGCCAAAGG GTATCAAAGC CTTGCATGCG 4020 

TTTTTGACGG ATGATGATAT CCTGCAAAGT CGTATCCCAA GCGTGACCAA GGTGAAGTTT 4080 

CCCAGTTACG TTTGGTGGTG GAATCACGAT TGAATAAGGC TTAGCCTTTT GATCGCCTGA 4140 

AGGCTTGAAA ACATCCGCAT CAAGCCATTT TTGGTAACGA CCAGCCTCAA CCTCGGCTGG 4200 

ATTGTATTTA GGTGAAAGTT CTTTAGACAT GTGTGTGTCC TTTCTCTATT TTGTTTATTT 4260 
TATTTTGAAT TTGCTTAGCA GCTTCTTCTG CAGACAAATT CGTATTATTT ATTTTAAAGT 4320 

AGTGGTGCAA CTCATTCGGT TGATGTTGGG AATTTAATTG AAGTGTTTCA GCGGTCTCTA 4380 

AAATTTCTCT TTCAGATACC TCAATATGTC GTTTTAAGGG TTTGTGCTTT AATCGATTCT 4440 

CCGTTCGATT TCGACGTATG CACTCTTCAA GACTTGTTTC CAATTCAACA AACAGAATCT 4500 

CTTGATGAAA GTTATCCAAT AAATCCTGAA TTTGCTTTAA ATACATCAGC TGGTACTGAT 4560 

TTGAAAAATC AATTACGTCT GTTAAAATTA CTGATCGCTG ATTTCTTGCA CTTGCTCCAA 4620 

GGAAAGAAAA GGTAATTCCA CGAACAAATT CCCACATCTC CTCGGTATAA TCCTGATAGA 4680 

TCTCTAGTGC AAAATCAATG GCTTGATGGT TATAAAATAG GGTAGCATCC GTCAGTCGAG 4740 

ATAATTCTTG ACCAATGGTC ATTTTTCCTG ATGCTGGAGC ACCAATGATG AAAAGATGCA 4800 

TCAAATCACC TCCCACTCAC TCCTCAGCAA GCCATATCTC AAATCATCAC AGCAGTTGCC 4860 

TTGAGCATCT TTGCGGTCTC TTATGCGAGC TTCGAGGGTA AAGCCAAGCT TTTCCGAGAC 4920 

TCGTTGACTT TGAAGGTTAT ATCCAAAGCA AGTTAGTTCA ATCTTGTGAA GACCAAGTTC 4980 

TTTAAAAGCT AGATCAATCA AGGAACACGC TGCTTCTGGA ACATAACCTC GACCCCAATA 5040 

GTCTGGGTGC AAGGTATAGC CAAGCTCTAG CACATCATCC GCATGAAGAT GGTTGAAGTC 5100 

AACAGAACCA ATGACTTTAT CGGTTCCTTT GACGACAATC CCATAGCCAG CTGGGAGATT 5160 

TTCCTTTTGA GTACGCTCCG GAAGAATGTG CTCCAGATAA TAAATCTCAT CTTCCAAGAT 5220 

CTTGACTGGA GGAAAACCTG CTGGATAGGC GACCTCTGGC AAACTAGCGT AGGTATGGAT 5280 

ATCCTCAGCA TCCACCACTG TGCGGACTCG TAAAACGAGA CGTTCTGTTT CGATTTTATC 5340 

TGGCAGCTCA GTTCTTGCCA TCCTTCTTCC TCGCTTTTTT GATGAAACTG CCCTTCATAT 5400 

CTACACGCTT GTCCAGATAG CGATAAACGC GCTGATATCC ATCTCCCATG AAATAGGTTG 5460 

GGGCAAACAG TTGATTTTTA AAATGTCCCT TTTCATCCAG GAGTTCTGGG GCAACAAGTC 5520 

GCTCAAGAAT CTTGGCAAAG ATGTGGCAAA TACCGTCTTC CTCAACAATC CTATCTACCC 5580 

GACAATCTAA AACAAGTGGA CAGGCGTCTA AAATAGGAGT CTGAGTTCGT TCAGAAATTT 5640 

CATAATGCAC TCCCAAACGT TCCAATTTCT CCTGATGACT GATAAAACCA GCCTGCTCCA 5700 

TCGCAAGCAT AGAAGTTTCA TCAGAAATAT TCACAGTAAA TTTTTGATAC TGTTTGATCT 5760 

GCTCTGCGGC ATTCTCTCTC GCAACGACTC CAATCACAAC CCAATCTCCT AGACTATAAG 5820 

AGGAACTACA GGTCGTGATG TTATAGCCAA AATTCTAATC TTGATATCCT AAAATAAAAA 5880 

CAGGAAAACC ATAATATAGT TTACTTGTGT TAAAAGATTG CTTCATAACA ACCCCCTTTG 5940 

ACTAAGACGT AAAAGAAAAG CCCTGCCATC TACATGACAG GGACGAATGT GTTTATCCGC 6000 

GGGG 6004 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5857 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

TGTAGAATTC ACGACAATGC TTCGTTGATT TCTGGGTTGA TTTCGTCGCG TTCTGGCAAG 60 

CGAGTCAATG AACCAAAAAT AGTACACAAT GTGGTATAAT CCTTTTATGG CATATTCAAT 120 

AGATTTTCGT AAAAAAGTTC TCTCTTATTG TGAGCGAACA GGTAGTATAA CAGAAGCATC 180 

ACACGTTTTC CAAATCTCAC GTAATACCAT TTATGGCTGG TTAAAGCTAA AAGAGAAAAC 240 

AGGAGAGCTA AACCACCAAG TAAAAGGAAC AAAACCAAGA AAAGTTGATA GAGATAGACT 300 

TAAAAACTAT CTTACTGACA ATCCAGATGC TTATTTGACT GAAATAGCTT CTGACTTTGG 360 

CTGTCATCCA ACTACCATCC ACTATGCGCT CAAAGCTATG GGCTACACTC GAAAAAAGAA 420 

CCACACCTAC TATGAACAAG ACCCAGAAAA AGTAGCCTTA TTTCTTAAGA ATTTTAATAG 480 

TTTAAAGCAC CTAACACCTG TTTAGATTGA CGAAACAGGA TTCGATACTT ATTTTTATCG 540 

AGAATATGGT CGCTCATTAA AAGGTCAGTT AATAAGAGGC AAAGTATCTG GAAGAAGATA 600 

TCAGAGGATT TCTTTGGTTG CAGGTCTAAC AAATGGTGAG TTAATCGCTC CAATGACTTA 660 

CGAAGAGACG ATGACGAGCG ACTTTTTTGA AGCTTGGTTT CAGAAGTTTC TCTTACCAAC 720 

ATTAACCACA CCATCGGTTA TTATTATGGA TAATGCAAGA TTCCATAGAA TGGGGAAGCT 780 

AGAACTCTTG TGTGAAGAGT TTGGGTATAA ACTTTTACCT CTTCCTCCCT ACTCACCTGA 840 

GTACAATCCT ATTGAGAAAA CATGGGCTCA TATCAAAAAG CACCTCAAAA AGGTATTACC 900 

AAGTTGCAAT ACCTTTTATG AGGCTTTTTT GTCTTGTTCT TGTTTCAATT GACTATATAA 960 

ATTGTCTAAG CGAAACAACC GATAAGAATT GGCACAAAAG CGACCGTATT TTTGTTACCA 1020 

ATACAGGAAA AACAGTTCAT AGTTCTATCT TGAGCAAGTC TCTCCAGCGA GCAAACGAAC 1080 

GCCTTAAAAA ACCAATTCCC AAACATCTGT CCCCTCACAT CTTCAGACAC ACCACTATTA 1140 

GCATCTTATC AGAAAATAAA ATTCCTTTAA AAACAATCAC GGACAGGGTT GGTCATCCCG 1200 

ACTCTGAAGT CACTACTTCC ATCTACACCC ACGTCACAAA GAACATGAAA GATGAAGCAA 1260 

TCAATGTACT GGATAAAGTT ATGAAAAAGA TTTTTTAAAA AGTTTTGTCC CTTTTTTGCC 1320 

CTCTAAATAC AAAAATAGCC CTTCGGATAA AATCCGAGGG GCTAGAAACG TTGTTAAATC 1380 
AACGGCCGAA CTTTTGAATT TCATGGTTCG GGATAAAATA GTTCACTGAA CTATTTTATT 1440 

TTTTAAGGTT ATCATAATAT CAAATAGTTC AATTAAATAC GCTAAATTAC TAATATACTT 1500 

TTTACCTTTT TCATTCTAAA ATGTAAAGTA CAAACAATTA CAATATACTA GAGGGGGAGT 1560 

AAAAAAGGTA TTAAATCGAT GAGTTCAGCA GGCAAGAAAA TAGCACCTTT ACGGGTGCTA 1620 

TTTTTTAATT AACGCCACGT TAACTTTTGA TTGATGAATT TTATTGTTTG GCACTTCTTT 1680 

CATTTCACGG TAAACATCGA TGAAATTCTT TCCAACATTA TTTTTGGAGT TAACTGCATT 1740 

TATTTTTGTA TTAATAACTT TTTTAGTATC GAAAGAATGG TTTAAGAAAT CCATAACTAA 1800 

CTCTCCTTTC TCATCCTGTA ATCAAGATTT TTATCAATGT CAAAATAGTA TTTTCTATCA 1860 

ATCCAAATTG GTCCTTCTCC TTTAGAAATA GCAAGTACAT CTACCGGACC TCCTACTGTT 1920 

TCAAGAGTGT TGACAATTTT TCTCTTAAAT GAAGTTAATT CAATAAATGT TTTAGCTGTA 1980 

CTCGCCATTT CATTAAGTGG TTGCATTCCA ATAAGGTCTA rTATAQGATT TATATAATAT 2040 

TTTTGCTGTA TAGATGATAT ATTTTCAAAT ATATTCTCAA TTTCATCACC CAATCCATTT 2100 

TTCTCCATAA CTGATGATAC TTGCTCTGCG ATATATACAT TTAAGTTAGG ATCTATACCA 2160 

TTCATAATCG TCTCAACCAT CTCTGACTGT GCAAAAGGGA TTATATGACA AGTTTTATGA 2220 

TGATTTATCA CACTTTCATT AATAACTTTC CAAATTAATC GTTTAGAAAA AATTCCATAT 2280 

AATTCAATTT GTCTTATAGA TGGAAATATC TCGTCTGTAC CATAACCTGC TATAACTAAT 2340 

CCAGTTATGT TTGTTGAGTC ATATCCAATG AAAATCGCTT TATATAAAGA TTTAGCAATA 2400 

ACTTCAACCT CATCATCAGT ATGAGGAAAG GATTTAAAAA CATCGTCTAC AATGCTTTTT 2460 

ATTAACTCTA ACTCAGCTTC AAAAAATTCA AAATTACTTT CAGCTTCTAC TTTTGAAATT 2520 

TCTAAACTAA AATTAGTTAT AGCATTTAAT AAAATTTTAT TAAAATCATC TAGAGTGATG 2580 

GTTTCACCAT TAGAAACTCT TAAATCAGCT GTTTCTTGCG CTTCATAGGC AATGCTGTCC 2640 

AAAATACTTC TTGTACTTCT GACAATATAA TTTCTTAATA AATCCTCAAC TTGTAGATGT 2700 

TTAAAGGAAA TTAAAAATTC TATTAGCTTT TCAACGTATT GGGCAGTATT ATCTAATAAA 2760 

TCTGTGCCAA TAGCCTGCTT AAACTCATTT AAAATTACCT CCCACGGAAT TTCCATAAAC 2820 

GAAGCGTTCC CATATATCAT GATCCCCACG GAATGTTCTT TTGATAAAGT GAATAATTTT 2880 

CGGGCGCTAT TAAAAACTTT TGAATTTTTC CCGTCTGATA AGGTTACAGC GCTATCAGAA 2940 

GCCAATACAA CACCATTTTT ATTTAATATT CCAATTTCTG CTGTCAAAAT ATCACCTAAA 3000 

CTTTCTAAAC CTGCTCATGC TCTAATGGTA CAACAGCTAA GGTCTTACCA AGACTTGCCA 3060 

ACACTTTTAA TACTGTATCA AGTTGTGGGC TTGTCTTTCC TGTTTCCATT CTAGCGATAA 3120 

CTGGCTGACT AACACCGCTC ATCTCCTCTA GTTTCTTCTG ACTAATACCC TTTTCATTTC 3180 
TAGCCTCGAT AAGCTCACTC ATGATAGCCA CGCGCATATC ACTTTCCAAA ATTTCCTCTT 3240 

TGCTGAATAA TTCAGCTCTT ACATCTTTCC AGTTACTACC AATAGCATTA TTTTTCATTG 3300 

TCTAAACCTC TTTCTTTTAA ATCTGCAAGT TCACGTTTAG CTTGCTCAAT CTCTCTTTTG 3360 

GGTGTTTTCT GTGTCCTTTT CATAAAATGA TGCAGTAAAA CAAAACTACC ATCCATCCAA 3420 

GCAACAAATA AAATTCTATC TCTAAGTGGT CTCAGCTCCC AAATTTCAGC ATCTAAATGC 3480 

TTAATATATG GTTCGCCTGC GCGTGTTCCA TGTTGGCTTA ACAACTCAAT ATAATCATTA 3540 

ATTTTATTAA GCTTAATTCT GCTATCTTTC CCTTTTTTAC TGGTAAGCTC TCGCATATAA 3600 

TCAAAAACAG GCTCATTGCC GTTTTTATCC TTGTAAAAAT AGATATTATG CACTATTAAC 3660 

ACCTCTTCCT AATAACAATT ATAACCTAAA AGTTATTGTT TGTAAATACT TTTAAGTTAT 3720 

TAAAATAAAA AGCACCTAGT TTCCTAGATG CTAGCACAAT GACACGGATT CGCACCGTGG 3780 

CTACCTCTAT CAAGGTGTAC TCCTTCTATA CTATCCCTTG TGCTTTAGAA TATTATACCA 3840 

CACAATCAAC TAGATACCTA CCATCTCATG ATATACCCCC ATTTTGGGCA AGGGTACAAC 3900 

GCTAAAATAC AAATCAGAAT AGATATTAAA CCACTTATTT AACTTATCAT AAGCTGGTGA 3960 

TTGACTGATA AATAATATCC GCTGACAAGC TCCGATAACA TTCATGTGAT TGTACACATA 4020 

AACCTCTTTT ACAGCCTCTA AAATGTCAGC CTCACTTGTT TGTACXXTAA TATCTGTTAT 4080 

CTGCTTGATA GTTGCGTATT TTTGATAAGC TAGCATATCT TGATTTTTAG CAGCATCAAA 4140 

CATTTTACGC TCAAGGACAC TATACTTAGG TTGTTCTTTA TCTCGCATGA AATACCACTT 4200 

GAGCCATAAA ATCTTTTCTC GGTGTATTAC AGAAATACGC TCAATTTTCT TCTTTGTCAT 4260 

TGCTACCTCC TAAATCATCA ATTTAACAAT TCTAACCACT CACTTTTAGA AATAGTTGCA 4320 

TAGATCTTGT TCGATGTATG ATACAAAGGT TCTAAATCTT TTTCCACCCT AATATAGTTC 4380 

ATCTTATCCT CATGAGTAGG AAAGTATAGT ATTTCCGTTT CATCCTCGTT TAGGATACGA 4440 

TTGCACCAAT CATCAATAAT AACTGGCACT TCCCACTCAC GCCATTTTTT AAGGTTTTCT 4500 

AAAAGTTCAT TATCACTAAA TAGCTCGCCA TCTATTTGGA AAAATTCCCC TAAGTCATTG 4560 

TTTCCTTCAA CAATAATAAA CTCTGGCATA TTTCTATTAC TTAATAACTC CTTGAGTTCT 4620 

TGTAACTCTT TGATTTCCTT TAGATACTTC CTCAATTTCC AACCTCAATT CTTCAATCTG 4680 

CCTTACTACT CCAAAAATTT CATGGGTCTT ATAAGATTGT TCAAGTATAG CCTTTGCTGC 4740 

TTGAGTTCTT ATAAACGGGT TGACCTTACT GTCCATCATA ATATCATTGA GTACAGAAAC 4800 

AGCGTTAGAT GATGCTAAAT AAAGCATTTG AGTTGTTTTA TCCATCATCT CATCTTGCTT 4860 

TATCCTCAAT GTCTTTTTAA CCGCTGCAAC TTTTAGATAC TTATGACCTG TTGCGCGTGA 4920 
TACCCCTGCT TTTTGACATG CTTTGTCTAT CGTTGGCTCG GTAAGCATGG CATCTATGAA 4980 

TTTAATTTGC TTGGACGTAA GGTTATCATT TTCATTTCCT GCCATCTATT ACCTCCTCAT 5040 

TATCAAAATA AAGGGTTGCC CCTTTATTTC CCTATGCTAG ATAATTCTGC AATTCTGCAT 5100 

CCATTGCCTC TGAATTGCCC TCAACAATCA TTTCATGCTG TACTAAATCA ATCTTATCTC 5160 

CGTTAATAAG TAAACCACCG TGGAAATAAT CAATTTTTCT ATCAAGGAAA TGTACTAGCT 5220 

TTTCAAGGCG TTGCTGTTGG CTGAATTGCT CCATGTCAAT TTCGATATAA GCAAGGGTAG 5280 

TATCATTATC CATAATATCT TCTAATTTTC TAAGAGCTAG AGGTTTATTT TTATATTTTT 5340 

CTAGGTATTC TCTCATTTCT GCCACTGTTA ATTTGATACT AGATAATAAA CTTAGTTCAG 5400 

CTGCATCATC TGCTGTAATA GGCTCTTCTT TTGATTCATG GTTTGCTAGT TCAGCATTTT 5460 

TCTCTTTTTC TAGTTGCTGA TACAATAGCT GAGCAGTATT TTGGGAATAG TTTTCGCCCT 5520 

CTTTTTTATA TTTTAAAAGT TCTTGCTCTG CATACACTTT CCCGATAATC ACTTCCTTAT 5580

AAACTAATTG CCCATCTTGA GCTTTTAGCT TAATACTCCC ATGCTCTGGA ATTTCAATAT 5640 

ACTTAATTAT ACCATTTTTT GAGTATAAAA CAAAGCCTTT CTCCATCATT TTTAATAATT 5700 

TATCATCCTT GTTTTCAGTC ATGCTTTTCT CCTTTATTTC ATTTTATTAT AATCTGAATA 5760 

CCCCTAGTCT ATTTATTTCA CTAGGTTTTT AGGGTTCGTA TGCTAAAATA CTACCCTTTT 5820 

TGTGTACCTT ATGGCTGACT TTTCAAATTG GTTAGTT 5857 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10254 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

AAAATGATAG CAGGAGAGTT TTCCCGTCCA TCAGACCCAG AACTGAGAGC CTTAGCTCAG 60 

GCTTCTCGCC AAAAACAGGC CGCCTTTAAC AAGGAAGAGA ACCCCTTGAA GGGAGCCGAA 120 

ATCATCAAGA CTTGGTTTGC CTCAACCGGG AAAAATCTTT ACATCAACAC TCGCTTGATG 180 

GTGGACTACG GTGTCAACAT CCATCTAGGG GAAAATTTTT ATTCTAATTG GAACTTGACC 240 

ATGCTGGATA TCTGTCCCAT TCGTATCGGG GACAATGCTA TGATTGGTCC TAATTGTCAG 300 

TTTTTGACAC CCCTCCATCC ACTAGATCCA CAGGAACGCA ATTCAGGTAT CGAGTACGGA 360 

AAGCCTATCA CAATCGGAGA TAATTTCTGG ACTGGTGGTG GCGTCATTGT CCTTCCTGGA 420 

GTGACACTGG GAAATAATGT CGTTGCAGGA GCAGGGGCAG TAATTACCAA ATCTTTTGGC 480 
C5ACAACGTTG TCCTAGCTGG CAATCCTGCG CGCGTGATTA AGGAAATACC TGTTAAATAG 540

AAGTAAAAAG GAACAGCTGG GGTTGTTTCT TTTTTGTAGG TTTCATCATT TTTTACCCAG 600

TTCACATTTA CCTACTCTAT CTCTTAGCAA GTCTGTTTCA TTAAGCAAGT TCAAAGCATC 660

TCGTAAGTGG GATGTTTTTC TCCTCAGTTC ATCAGCTTCC TCCTTGACAC TCGGTCAGAT 720

TTTGATACAA TAGTACAAAA TTAGAGGAGG CAGGCTATGA TTCAGAAACA TGCGATTCCT 780

ATTTTAGAGT TTGATGACAA TCCTCAGGCG GTTATCATGC CCAATCACGA GGGGCTGGAC 840

TTGCAGTTGC CAAAGAAGTG TGTTTATGCA TTTTTAGGTG AGGAGATTGA CCGCTATGCG 900

AGGGAAGTAG GGGCGAACTG TGTTGGCGAA TTTGTTTCTG CCACCAAGAC CTATCCAGTT 960

TATGTCGTGA ACTACAAGGA CGAGGAGGTC TGTCTGGCTC AGGCTCCTGT TGGCTCCGCT 1020

CCAGCAGCCC AGTTTATGGA TTGGTTGATT GGCTATGGTG TGGAGCAGAT TATCTCTACT 1080

GGGACCTGTG GTGTCCTAGC TGATATAGAG GAAAATGCCT TTCTAGTCCC TGTTCGCGCT 1140

CTGCGAGATG AAGGAGCCAG TTACCACTAT GTGGCACCTT GTCGTTATAT GGAAATGCAG 1200

CCAGAGGCTA TTGCTGCTAT TGAGGAAGTT TTGGAAGACA GAGGGATTCC TTATGAAGAA 1260

GTCATGACCT GGACGACAGA CGGTTTTTAC CGAGAAACGG CTGAAAAGGT GGCTTATCGT 1320

AAGGAAGAAG GCTGTGCTGT TGTGGAGATG GAGTGTTCTG CTCTTGCX3GC AGTAGCTCAA 1380

TTGCGTGGGG TTCTCTGGGG TGAATTGTTG TTCACAGCAG ATTCTCTAGC GGACTTGGAC 1440

CAGTACGACA GTCGTGACTG GGGCTCGGAA GCTTTTAATA AGGCGCTAGA ACTGAGTTTA 1500

GCAAGTGTTC ACCACCTTTA GTTGTACTGG CAAAGGATTT GTTTTATCAT AAAATGTCTA 1560

GCTCATACTT TTCAAAAATA TGTTTAAACG AGGTCACCTT CCTCTTGTCC TAGGCATGTT 1620

GAGGTTGGGA AAAATCTTTA AAATCAGAAA AACGTATCAT ATCAGGTGAT GAAAACTTTG 1680

ACACTATGCG TTTTATGTCG ATAAGATTTA GAGTGAGATG AAATGATACT CTTCGAAAAT 1740

CTCTTCAAAC CAGGTCAGCT TCACCTTGCC GTAGGTATAT GTTACTGACT TCGTCAGTCT 1800

TATCCGGCAA CCTCAAAACG GTGTTTTGAG CTGACTTCGT CAGTTCTATT TGCAACCTCA 1860

AAACAGTGTT TTGAGCAACC TGTGACTAGC TTTCTAATCG ATGCCTTGGT TTTCATTGCC 1920

TATAATCAAA AAGAGAAATT TTCTCCTGAA AAGCATATAG AGTAGCTGGC GTTAAAAGCT 1980

CCTGTCTTGC TTTTTTGACC TATAGTCACA TCTATCAAGT ATTGTTCTTG CCTAAGCTAT 2040

CAATAAAAAG GTQGCATTTT TTAGGCTTGG TGTTAGTAGA TTTTGCCTTA TCCTATCTAA 2100

GTCATTTCGA ACTTTTTATG GTACAATGGA AACATGTTAT TCAAATTATC TAAGGAAAAA 2160

ATAGAGCTAG GCTTATCTCG TTTATCGCCA GCCCGTCGTA TTTTTTTGAG TTTTGCCTTG 2220

GTCATTTTAC TAGGCTCTCT TCTTTTGAGC TTGCCCTTTG TCCAAGTTGA AAGCTCACGA 2280

GCGACTTATT TTGATCATCT TTTCACTGCT GTCTCTGCAG TCTGTGTGAC GGGTCTCTCA 2340 

ACCCTTCCAG TAGCTCACAC CTATAATATC TGGGGTCAAA TAATCTGTTT GCTCTTGATT 2400 

CAGATCGGTG GTCTAGGGCT CATGACCTTT ATTGGGGTTT TCTATATCCA GAGCAAGCAA 2460 

AAGCTTAGTC TTCGTAGCCG TGC7UVCTATT CAGGATAGTT TTAGTTATGG AGAAACTCGA 2520 

TCTTTGAGAA AGTTTGTCTA TTCTATTTTT CTCACGACCT TTTTGGTTGA GAGCTTGGGA 2580 

GCTATTTTGC TTAGTTTTCG CCTTATTCCT CAACTTGGCT GGGGACGTGG TCTTTTTAGT 2640 

TCCATTTTTC TAGCGATCTC AGCCTTCTGT AATGCCGGTT TTGATAATTT AGGGAGCACC 2700 

AGTTTATTTG CTTTTCAGAC OGATTTACTG GTCAATCTGG TGATTGCAGG CTTGATTATT 2760 

ACAGGCGGCC TTGGTTTTAT GGTCTGGTTT GATTTGGCTG GTCATGTAGG AAGAAAGAAA 2820

AAAGGACGTC TGCACTTTCA TACGAAGCTT GTACTATTAT TGACTATAGG TTTGTTGTTA 2880 

TTTGGAACAG CAACTACTCT CTTTCTTGAG TGGAACAATG CTGGAACGAT TGGCAATCTC 2940 

CCTGTTGCCG ATAAGGTTTT AGTTAGCTTT TTTCAAACAG TGACGATGCG AACAGCTGGC 3000 

TTTTCTACGA TAGATTATAC TCAGGCTCAT CCTGTGACTC TTTTGATTTA TATCTTACAG 3060 

ATGTTTCTAG GTGGGGCACC TGGAGGAACA GCTGGGGGAC TCAAGATTAC GACATTTTTT 3120 

GTCCTCTTGG TCTTTGCACG AAGTGAGCTT CTAGGCTTGC CTCATGCCAA TGTTGCGAGA 3180 

CGAACGATCG CGCCGCGAAC GGTTCAAAAA TCCTTTAGTG TCTTTATTAT CTTTTTGATG 3240

AGCTTCTTGA TAGGATTGAT TCTGCTAGGG ATAACAGCCA AAGGCAATCC TCCCTTTATC 3300

CACCTCGTAT TTGAAACCAT TTCAGCTCTT AGTACAGTTG GTGTAACGGC AAATCTGACT 3360 

CCTGACCTTG GGAAATTGGC TCTCAGTGTT ATCATGCCAC TTATGTTTAT GGGACGAATT 3420 

GGTCCCTTGA CCTTGTTTGT TAGCTTGGCA GATTACCATC CAGAAAAGAA AGATATGATT 3480

CACTATATGA AAGCAGATAT TAGTATTGGT TAAGAAAGGA AAGAGCATGT CAGATCGTAC 3540 

GATTGGAATT TTGGGCTTGG GAATTTTTGG GAGCAGTGTC CTAGCTGCCC TAGCCAAGCA 3600 

GGATATGAAT ATTATCGCTA TTGATGACCA CGCAGAGCGC ATCAATCAGT TTGAGCCAGT 3660 

TTTGGCGCGT GGAGTGATTQ GTGACATCAC AGATGAAGAA TTATTGAGAT CAGCAGGGAT 3720 

TGATACCTGC GATACCGTTG TAGTCGCGAC AGGTGAAAAT CTGGAGTCGA GTGTGCTTGC 3780 

GGTTATGCAC TGTAAGAGTT TGGGGGTACC GACTGTTATT GCTAAGGTCA AAAGTCAGAC 3840

CGCTAAGAAA GTGCTAGAAA AGATTGGAGC TGACTCGGTT ATCTCGCCAG AGTATGAAAT 3900

GGGGCAGTCT CTAGCACAGA CCATTCTTTT CCATAATAGT GTTGATGTCT TTCAGTTGGA 3960 

TAAAAATGTG TCTATCGTOG AGATGAAAAT TCCTCAGTCT TGGGCAGGTC AAAGTCTGAG 4020 
TAAATTAGAC CTCCGTGGCA AATACAATCT GAATATTTTG GGTTTCCGAG AGCAGGAAAA 4080

TTCCCCATTG GATGTTGAAT TTGGACCAGA TGACCTCTTG AAAGCAGATA CCTATATTTT 4140 

GGCAGTCATC AACAACCAGT ATTTGGATAC CCTAGTAGCA TTGAATTCGT AAAGACGGAT 4200 

GACCCCTCTT TTTTGATGCC TAAGATGGCA AATAGAGACA GAAGCCCCTT GTCTTCTAGT 4260 

AAAAGTTCTT CAAAGGCTGG ACTTTATGGT AAAATAGAAA GAAGTGACAA GAGAGAGTAA 4320 

TACTCAATGA AAATCAAAGA TCAAACTAGG AAACTAGCTA CGGGCTGCTC AAAACACTGT 4380

TTTGAGGTTG CAGATAGAAC TGACGAAGTC AGTAACATCT ATACGGCAAG GCGACGTTGA 4440 

CGCGGTTTGA AGAGATTTTC GAAGAGTATA AGAAAAAATC AGTCCCCTAA AGGAGTAGAT 4500 

TATGAAGTTA TTGTCTATCG CAATTTCTAG CTATAATGCA GCAGCCTATC TTCATTACTG 4560 

TGTGGAGTCG CTAGTGATTG GTGGTGAGCA AGTTGGGATT TTGATTATCA ATGACGGGTC 4620 

TCAGGATCAG ACTCAGGAAA TCGCTGAGTG TTTAGCTAGC AAGTATCCTA ATATCGTTAG 4680 

AGCCATCTAT CAGGAAAATA AATGCCATGG CGGTGCGGTC AATCGTGGCT TGGTAGAGGC 4740

TTCTGGGCGC TATTTTAAAG TAGTTGACAG TGATGACTGG GTGGATCCTC GTGCCTACTT 4800 

GAAAATTCTT GAAACCTTGC AGGAACTTGA GAGCAAAGGT CAAGAGGTGG ATGTCTTTGT 4860

GACCAATTTT GTCTATGAAA AGGAAGGGCA GTCTCGTAAG AAGAGTATGA GTTACGATTC 4920 

AGTCTTGCCT GTTCGGCAGA TTTTTGGCTG GGACCAGGTC GGAAATTTCT CCAAAGGCCA 4980

GTATACCATG ATGCACTCGC TGATTTATCG GACAGATTTG TTGCGTGCTA GCCAGTTCTA 5040 

ACTGCCTGAA CATACTTTTT ATGTCGATAA TCTCTTTGTC TTTACGCCCC TTCAGCAGGT 5100 

CAAGACCATG TACTATCTGC CTGTCGATTT CTATCGTTAT TTGATTGGGC GTGAGGACCA 5160 

GTCTGTCAAT GAGCAAGTGA TGATTAAGTG CATTGACCAG CAACTCAAGG TCAATCGACT 5220 

CTTGATAGAC CAACTTGATT TGTCCCAAGT GAGTCATCCC AAAATGCGAG AATATCTGCT 5280 

GAATCATATT GAACTCACGA CGGTGATTTC CAGTACCCTG CTCAACCGAT CTGGAACAGC 5340 

GGAGCATCTG GCAAAAAAAC GCCAATTGTG GACCTATATT CAGCAGAAAA ATCCAGAAGT 5400 

CTTTCAGGCT ATTCGTAAGA CCATGTTGAG CCGTTTGACC AAACATTCTG TCTTGCCAGA 5460 

TCGCAAACTG TCCAATGTCG TCTATCAAAT CACCAAATCT GTTTATGGAT TTAATTAATA 5520 

TAAGTGTTTT ATAAGAGGGA TTTAAGAAAA ATTTTAACTT TTTCTTAGTC CTTTTTAATT 5580 

TCAGGAGATT ATACTAGAGT CATCAAATAA AGAAAGACTC TAAGGAGAAT CCTATGAAAT 5640 

TCAATCCAAA TCAAAGATAT ACTCGTTGGT CTATTCGCCG TCTCAGTGTC GGTGTTGCCT 5700 

CAGTTGTTCT GGCTAGTGGC TTCTTTGTCC TAGTTGGTCA GCCAAGTTCT GTACGTGCCG 5760 
ATGGGCTCAA TCCAACCCCA GGTCAAGTCT TACCTGAAGA GACATCGGGA ACGAAAGAGG 5820

GTGACTTATC AGAAAAACCA GGAGACACCG TTCTCACTCA AGCGAAACCT GAGGGCGTTA 5880

CTGGAAATAC GAATTCACTT CCGACACCTA CAGAAAGAAC TGAAGTGAGC GAGGAAACAA 5940 

GCCCTTCTAG TCTGGATACA CTTTTTGAAA AAGATGAAGA AGCTCAAAAA AATCCAGAGC 6000 

TAACAGATGT CTTAAAAGAA ACTGTAGATA CAGCTGATGT GGATGGGACA CAAGCAAGTC 6060 

CAGCAGAAAC TACTCCTGAA CAAGTAAAAG GTGGAGTGAA AGAAAATACA AAAGACAGCA 6120 

TCGATGTTCC TGCTGCTTAT CTTGAAAAAG CTGAAGGGAA AGGTCCTTTC ACTGCCGGTG 6180 

TAAACCAAGT AATTCCTTAT GAACTATTCG CTGGTGATGG TATGTTAACT CGTCTATTAC 6240 

TAAAAGCTTC GGATAATGCT CCTTGGTCTG ACAATGGTAC TGCTAAAAAT CCTGCTTTAC 6300 

CTCCTCTTGA AGGATTAACA AAAGGGAAAT ACTTCTATGA AGTAGACTTA AATGGCAATA 6360 

CTGTTGGTAA ACAAGGTCAA GCTTTAATTG ATCAACTTCG CGCTAATGGT ACTCAAACTT 6420 

ATAAAGCTAC TGTTAAAGTT TACGGAAATA AAGACGGTAA AGCTGACTTG ACTAATCTAG 6480

TTGCTACTAA AAATGTAGAC ATCAACATCA ATGGATTAGT TGCTAAAGAA ACAGTTCAAA 6540 

AAGCCGTTGC AGACAACGTT AAAGACAGTA TCGATGTTCC AGCAGCCTAC CTAGAAAAAG 6600 

CCAAGGGTGA AGGTCCATTC ACAGCAGGTG TCAACCATGT GATTCCATAC GAACTCTTCG 6660 

CAGGTGATGG CATGTTGACT CGTCTCTTGC TCAAGGCATC TGACAAGGCA CCATGGTCAG 6720 

ATAACGGCGA CGCTAAAAAC CCAGCCCTAT CTCCACTAGG CGAAAACGTG AAGACCAAAG 6780 

GTCAATACTT CTATCAAGTA GCCTTGGACG GAAATGTAGC TGGCAAAGAA AAACAAGCGC 6840

TCATTGACCA GTTCCGAGCA AAyGGTACTC AAACTTACAG CGCTACAGTC AATGTCTATG 6900 

GTAACAAAGA CGGTAAACCA GACTTGGACA ACATCGTAGC AACTAAAAAA GTCACTATTA 6960 

ACATAAACGG TTTAATTTCT AAAGAAACAG TTCAAAAAGC CGTTGCAGAC AACGTTAAAG 7020

ACAGTATCGA TGTTCCAGCA GCCTACCTAG AAAAAGCCAA GGGTGAAGGT CCATTCACAG 7080 

CAGGTGTCAA CCATGTGATT CCATACGAAC TCTTCGCAGG TGATGGTATG TTGACTCGTC 7140 

TCTTGCTCAA GGCATCTGAC AAGGCACCAT GGTCAGATAA CGGTGACGCT AAAAACCCAG 7200 

CCCTATCTCC ACTAGGTGAA AACGTGAAGA CCAAAGGTCA ATACTTCTAT CAATTAGCCT 7260

TGGACGGAAA TGTAGCTGGC AAAGAAAAAC AAGCGCTCAT TGACCAGTTC CGAGCAAACG 7320 

GTACTCAAAC TTACAGCGCT ACAGTCAATG TCTATGGTAA CAAAGACGGT AAACCAGACT 7380

TGGACAACAT CGTAGCAACT AAAAAAGTCA CTATTAACAT AAACGGTTTA ATTTCTAAAG 7440 

AAACAGTTCA AAAAGCCGTT GCAGACAACG TTAAGGACAG TATCGATGTT CCAGCAGCCT 7500

ACCTAGAAAA GGCCAAGGGT GAAGGTCCAT TCACAGCAGG TGTCAACCAT GTGATTCCAT 7560

ACGAACTCTT CGCAGGTGAT GGCATGTTGA CTCGTCTCTT GCTCAAGGCA TCTGACAAGG 7620

CACCATGGTC AGATAACGGC GACGCTAAAA ACCCAGCTCT ATCTCCACTA GGTGAAAACG 7680

TGAAGACCAA AGGTCAATAC TTCTATCAAG TAGCCTTGGA CGGAAATGTA GCTGGCAAAG 7740 

AAAAACAAGC GCTCATTGAC CAGTTCCGAG CAAACGGTAC TCAAACTTAC AGCGCTACAG 7800 

TCAATGTCTA TGGTAACAAA GACGGTAAAC CAGACTTGGA CAACATCGTA GCAACTAAAA 7860 

AAGTCACTAT TAAGATAAAT GTTAAAGAAA CATCAGACAC AGCAAATGGT TCATTATCAC 7920 

CTTCTAACTC TGGTTCTGGC GTGACTCCGA TGAATCACAA TCATGCTACA GGTACTACAG 7980 

ATAGCATGCC TGCTGACACC ATGACAAGTT CTACCAACAC GATGGCAGGT GAAAACATGG 8040 

CTGCTTCTGC TAACAAGATG TCTGATACGA TGATGTCAGA GGATAAAGCT ATGCTACCAA 8100 

ATACTGGTGA GACTCAAACA TCAATGGCAA GTATTGGTTT CCTTGGGCTT GCGCTTGCAG 8160

GTTTACTCGG TGGTCTAGGT TTGAAAAACA AAAAAGAAGA AAACTAATCA GCTAAGGAAA 8220 

TAAATGATGG ATAGTGGGCT GACTAAGATT AGTTTAACAA CTCAATCAGC AATCAGGACT 8280 

TTCTTTCAAT AGCAGATTAA AATCATCGTA AAACAATAAA AATAGTGTTA TACTTAAAGC 8340 

AGTATAGCAC TGTTTTTATC AAAGGAGAGA CAGATGGGAA AGACAATTTT ACTCGTTGAC 8400 

GACGAGGTAG AAATCACAGA TATTCATCAG AGATACTTAA TTCAGGCAGG TTATCAGGTC 8460

TTGGTAGCCC ATGATGGACT GGAAGCGCTA GAGCTGTTCA AGAAAAAACC GATTGATTTG 8520 

ATTATCACAG ATGTCATGAT GCCTCGGATG GATGGTTATG ATTTAATCAG TGAGGTTCAA 8580

TACTTATCAC CAGAGCAGCC TTTCCTATTT ATTACTGCTA AGACCAGTGA ACAGGACAAG 8640 

ATTTACGGCC TGAGCTTGGG AGCAGATGAT TTTATTGCTA AGCCTTTTAG CCCACGTGAG 8700 

CTGGTTTTGC GTGTCCACAA TATTTTGCGC CGCCTTCATC GTGGGGGCGA AACAGAGCTG 8760 

ATTTCCCTTG GCAATCTAAA AATGAATCAT AGTAGTCATG AAGTTCAAAT AGGAGAAGAA 8820 

ATGCTGGATT TAACTGTTAA ATCATTTGAA TTGCTGTGGA TTTTAGCTAG TAATCCAGAG 8880 

CGAGTTTTCT CCAAGACAGA CCTCTATGAA AAGATCTGGA AAGAAGACTA CGTGGATGAC 8940

ACCAATACCT TGAATGTGCA TATCCATGCT CTTCGACAGG AGCTGGCAAA ATATAGTAGT 9000

GACCAAACTC CCACTATTAA GACAGTTTGG GGGTTGGGAT ATAAGATAGA GAAACCGAGA 9060 

GGACAAACAT GAAACTAAAA AGTTATATTT TGGTTGGATA TATTATTTCA ACCCTCTTAA 9120 

CCATTTTGGT TGTTTTTTGG GCTGTTCAAA AAATGCTGAT TGCGAAAGGC GAGATTTACT 9180 

TTTTGCTTGG GATGACCATC GTTGCCAGCC TTGTCGGTGC TGGGATTAGT CTCTTTCTCC 9240 

TATTGCCAGT CTTTACGTCG TTGGGCAAAC TCAAGGAGCA TGCCAAGCGG GTAGCGGCCA 9300

AGGATTTTCC TTCAAATTTG GAGGTTCAAG GTCCTGTAGA ATTTCAGCAA TTAGGGCAAA 9360



CTTTTAATGA GATGTCCCAT GATTTGCAGG TAAGCTTTGA TTCCTTGGAA GAAAGCX3AAC 9420

GAGAAAAGGG CTTGATGATT GCCCAGTTGT CGCATGATAT TAAGACTCCT ATCACTTCGA 9480

TCCAAGCGAC GGTAGAAGGG ATTTTGGATG GGATTATCAA GGAGTCGGAG CAAGCTCATT 9540

ATCTAGCAAC CATTGGACGC CAGACGGAGA GGCTCAATAA ACTGGTTGAG GAGTTGAATT 9600

TTTTGACCCT AAACACAGCT AGAAATCAGG TGGAAACTAC CAGTAAAGAC AGTATTTTTC 9660

TGGACAAGCT CTTAATTGAG TGCATGAGTG AATTTCAGTT TTTGATTGAG CAGGAGAGAA 9720

GAGATGTCCA CTTGCAGGTA ATCCCAGAGT CTGCCCGGAT TGAGGGAGAT TATGCTAAGC 9780

TTTCTCGTAT CTTGGTGAAT CTGGTCGATA AOGCTTTTAA ATATTCTGCT CCAGGAACCA 9840

AGCTGGAAGT GGTGGCTAAG CTGGAGAAGG ACCTIGCTTTC AATCAGTGTG ACCGATGAAG 9900

GGCAGGGTAT TGCCCCAGAG GATTTGGAAA ATATTTTCAA ACGCCTTTAT CGTGTCGAAA 9960

CTTCGCGTAA CATGAAGACA GGTGGTCATG GATTAGGACT TGCGATTGCG CX3TGAATTGG 10020

CCCATCAATT GGGTGGGGAA ATCACAGTCA GCAGCCAGTA CGGTCTAGGA AGTACCTTTA 10080

CCCTCGTTCT CAACCTCTCT GGTAGTGAAA ATAAAGCCTA AAACCCCTTT ACAAATCCAG 10140

CTATTCATGG TAGAATAGAT TTTGTGTGAA ATATCAGCAG GAAAGCATGA AGCTCGTCAA 10200

CAGGTGTCTT ATGACAAGTA ACCTTGGCTG TTTAGGCGAA GGGCATCTGC ACGG 10254



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9769 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

CCGGCGACTA TCGATAACAC TTGACTTGGT AGCCCCACAT TTTGGACAAC GCATCCTTTC 60

CCTCCTTATC GTTTTCTTTT CATTATACCA TTTTTTAAGC GATTCCCAAA ACAATTCTTC 120 

TTTTTGCTTG ACAAGTTTTT TGTTTTGTTG TATTATTTAA TTAAGACAAC AAGGTAAAAG 180 

AAAGGAGACT AAGATGTCCT GGACATTTGA CAACAAA7WUV CCCATCTATT TACAGATTAT 240 

GGAGAAAATC AAGCTTCAGA TTGTTTCCCA TACACTGGAA CCCAATCAAC AACTTCCAAC 300 

CGTGAGGAGC TAGCTAGCGA GGCTGGTGTC AATCCCAATA CCATCCAAAG AGCCTTATCA 360 

GACCTTGAAC GAGAAGGATT TGTCTACAGC AAGCGAACAA CTGGACGATT TGTGACTAAG 420 

GATAAGGAGC TAATCGCCCA GTCACGCAAA CAATTATCAG AAGAAGAATT GGAACACTTC 480

GTTTCCTCCA TGACCCATTT TGGCTATGAA AAAGAAGAAC TACCAGGCGT AGTCAGTGAT 540

TATATTAAAG GAGTTTAAGC CTATGTCATT ACTAGTATTT GAAAATGTAT CCAAATCATA 600 

TGGAGCAACA CCAGCCCTTG AAAATGTTTC TCTTGACATT CCAGCTGGAA AAATTGTCXSG 660

CCTTCTTGGG CX:AAACX3GCT CAGGAAAAAC AACCCTGATT AAACTAATTA ATGGCCTCTT 720 

ACAACCAGAT CAAGGACGTG TCCTCATCAA CGACATGGAC CCAAGCCCAG CAACCAAGGC 780 

CGTTGTAGCT TATTTGCCTG ATACGACCTA TCTCAATGAG CAAATGAAGG TCAAAGAAGC 840 

CCTAACCTAC TTCAAGACCT TCTATAAAGA TTGTCAGATC TTGAACGCGC CCATCATCTA 900 

CTTGCAGACC TGGGCATTGA TGAAAATAGT CGTCTCAAGA AACTATCAAA AGGAAACAAA 960 

GAAAAGGTTC AACTGATTTT GGTTATGAGC OGTGATGCTC GTCTCTATGT TTTGGACGAA 1020

CCCATTGGTG GGGTGGATCC AGCAGCCCGT GCTTATATCC TCAATACCAT TATCAACAAC 1080 

TACTCACCAA CTTCTACCGT TTTGATTTCT ACCCACTTGA TTTCTGATAT CGAGCCAATC 1140

TTGGATGAAA TTGTCTTCCT AAAAGACGGA AAAGTCGTCC GTCAAGGAAA TGTAGATGAT 1200 

ATTCGCTACG AGTCAGGTGA ATCCATTGAC CAACTCTTCC GTCAGaATTT AAGGCCTAAG 1260 

CAAAGGAGAT TATTTATGTT TTGGAATTTA GTTCGCTACG AATTTAAAAA TGTTAACAAG 1320 

TGGTATTTAG CCCTCTACGC AGCCGTGCTA GTCCTTTCTG CCCTCATCGG AATACAGACA 1380 

CAAGGCTTTA AAAATCTACC TTACCAAGAA AGTCAGGCTA CTATGCTACT TTTTCTAGCT 1440 

ACAGTCTTTG GTGGCTTGAT GCTTACACTT GGGATTTCAA CCATTTTCTT GATTATTAAA 1500 

CGCTTCAAAG GTAGTGTCTA CGACCGACAA GGCTATCTGA CTTTGACCTT GCCAGTTTCT 1560 

GAACACCATA TCATCACAGC CAAACTAATC GGTGCCTTTA TCTGGTCATT GATTAGCACC 1620 

GCTGTATTGG CTCTAAGTGC TGTTATTATT CTGGCTTTAA CAGCTCCAGA ATGGATTCCT 1680 

CTTTCTTATG TGATTACATT TGTAGAAACA CATCTCCCTC AGATCTTTCT TACAGGTATA 1740 

TCCTTCCTAC TAAATACTAT TTCAGGAATC CTCTGCATCT ACCTGGCTAT TTCCATTGGA 1800 

CAGCTTTTCA ATGAATACCG TACAGCACTC GCTGTTGCAG TCTACATTGG TATCCAAATC 1860 

GTCATTGGAT TTATTGAACT TTTCTTCAAT CTTAGTTCTA ATTTCTATGT CAATTCACTG 1920 

GTAGGACTCA ATGACCATTT CTATATGGGA GCAGGTATAG CCATTGTTGA AGAACTCATA 1980 

TTCATAGCTA TCTTTTATCT CGGAACCTAC TACATCTTGA GAAATAAGGT TAATTTGCTT 2040 

TAAATAATTT TTACCTAOAT ATGTAACATA CTCATAGAAC AAAAGAGACC AGGCAAAAAG 2100 

TCTTTAAAAT TAGAAAACGC ATAGTATCAG GTGTTGAATA TGTACTGCcC CCCAAAAGTT 2160

AGATTTTTTC TGTCTAACTT TTGGGGGCAG TTCATAAGAA CCTTGGTAAT ATGCGTTTTT 2220 
TGTGAGCTGA CTTATTTCCT TTCACTATAT CGCAAAATGA AATAAGAACG GAACGATGGG 2280

ATTTTGGAAT TCAAATCAAT TTATAAGAAT GTTTTAGAAG TAATATTATC CTATTCCAGA 2340 

TTCAGTTCAC TATACAATTG AGTTTTCAAG CAACCTGTTT ACATAATGTG TACATAATTA 2400 

GGTTCGTGAT TCCACCCTTT TCACCTTTAA AAACCTCGCT TTCGCAAGGC TCTTCTATTT 2460 

ATAAGATAAG GCACGTTTAA AGGTTTTCCA AATCCCTAAA TCATCCGTTT GAAGAACGAG 2520 

ACTAGCATAC ATGCGTCCGA TAAATCCTGT TGCTACCACC GCAAAAATCA CTGTAATAGC 2580 

AAGTGAAATC CATGCTTCTG CTCCCCCCGC ATAGTCATTA ATCGTTCGAA ACGGCATAAA 2640 

GAAGGTCGAA ATAAAGGGAA TATAAGAACC AATCTTCAAG AGGAGATTGT CACCAGCTGC 2700

ACCTAGAGCT GTCACTCCAA AAAAACCACC CATAATCAAA ATCATCAAAG GCGACAAGGC 2760 

TTTCCCTGAG TCCTCAGGAC GAGAAACCAT AGATCCTAGG AAGGCTGCCA AGACTACGTA 2820 

CATGAAAAGA CTGATCAAAA TAAAGAGCAA GGTATTCAGT GAGATAGCAT CTCCCAAGTG 2880 

ATCCAAAATA CCAGACTGAG CCAAGAATGG CAAATCTTTA AAGAGCAAAA CGGCAGCCAG 2940 

ACCACCTACA ACATAGATCC CAATATGCGT TAAAATCACT AGAAACAGAG CCATCATCCG 3000 

CGCATAGAAA TAGTGACTTG CCCTTATGCT AGAAAAAACG ACTTCCATAA TTTTGGTGCC 3060 

TTTTTCACTG GCAACTTCCT GAGCTGTTAC ACCCGCATAG GTAATCAGAA TCATATAAAG 3120 

AAAGAATCCT AAGGCACCTG CTGCAATTGT TTGAATAAAC TTTTTATTTT CCTTGGCTTC 3180 

ATCAATCTTT TCTGTGAATT GAATTGTCTG CGCTAAGCGT TTTTCCTGCT CTTGAGACAA 3240 

GGAAGCAGTT GAACGATTAA GCTGATTTTG CAGTTCATTG AGTGTACCTG TAACCTCAAA 3300 

TTTAATTCCA TTTTCAAGCG ATGTTTCGCC ATGATAAACT GCCTTTAGAA CACTATCTTC 3360 

TTGATCAATG GTCAAATAAC CTTTTAATTT TTCTTCTTTA ATTGCTTCTT TGGCACTTGC 3420 

TTCGTCTTTA TAGTCGAAGT TAACACCATT TACATTCTTC AGTCCTTCTG CTACAGATGG 3480 

CACTGTTGTC ACTACTGCCA CTTTATTATT TTTAGCCATA GAAGAACCTT GGAGATGCCC 3540 

AATTCCTACA GAGATTCCTA AAAAGAGGAA CGGCGAAATC ACCATAAAGA AGAAACTCCA 3600 

TGACTCGACA TGTCGAAGAT AGGTTTCCTT GATTACAACC CACATATTTC TCATACTTCC 3660 

ACTCCTGATT CTAGTTTAAA GATTTCATCG ATAGTTGGCG CTTGTTGGTC AAATGTTGCG 3720 

ATATATTGAC CTTGAGTCAA GATTGAGAAG AGTTCCCTTC CAGCGCTCTC ATCCTCCAAA 3780 

ATCAATTTCC AACTGCCTTG TTTGGTCAAG CTCACCTGTT TGACATGAGG AAGATTTTCC 3840 

AATTCTTCCT TGCTTCGTTC ACTTGAAACA AAGAGACGCG TTTTCCCGTA TTQATTGCGG 3900 

ACATCCTGAA CTGGTCCGTG CAAGACCACA CGGCCATCTC GGATCATCAG AATATCGTCA 3960 

CAAAGTTCCT CAACATTGGT CATGACATGG TCAGAAAAGA TAATGGTTGT CCGCGCTCTT 4020

TTTCCTGAAA AATGACTTGT TTGAGCAATT CTGTATTAAC TGGGTCCAAT CCACTAAAAG 4080

GCTCATCCAA GATAATCAGG TCTGGTTCAT GAATCAGAGT AATAATGAGC TGAATCTTCT 4140 

GCTGATTTCC TTTTGACAGA CTCTTGATTT TATCTGTCAG CTTTCCTTTC ACTTCCAACC 4200 

tcttcatcx:a TTGAGGGAGT TTTTCTTTGA CTTCTTTGGC ATCCATGCCT TTTAGAGTCG 4260 

CCAAGTAGCG AACTTGTTCA AGAACTGTCA ATTTAGGCAT GAGATGCGTT CTTCAGGCAG 4320 

ATAACCAATC CGAGCATAGG TCTCCTGACG AATATCCTGA CCATCCAGAC CGATTTCTCC 4380 

CTGATATTCT AGGAATTTCA AAATACTATG GAAAATCGTT GTTTTTCCAG CACCATTTTT 4440 

TCCGACTAGT CCCAAAATAC GACCTGGTCG CGCTTGAAAG TCAATACCAA ACAAAACTTG 4500 

CTTGGATCCA AAACTTTTCT CTAGACTTCT TACTTCTAGC ATCTTTCACC TCCGAAATTT 4560 

CTTGCACTCA TTATACTCCT TTTTGATAGC CTTTACAATG TTTTTTGTCC ATTTTTAGAA 4620 

GACTATTGCT GTGTAAAATA TGGCCTGGAG CACTTTTATA CTCAATGAAA ATCAAAGAGC 4680 

AAACTAGGAA GCTAGCCGTA GACTGCTCAA AGTACAGCTT TGAGGTTGCA GATAAAACTG 4740 

ACGAAGTCgA CTCAAAACAC TGTTTTGAGG TTGTGGATAG AACTGACGAA kCrTAaCTAT 4800

ATCTACGGCA AOGCGAAcTG ACGTGGTTTG AAGAGATTTT CXSAAGAGTAT TAGTGATAAA 4860 

TCCATTATAC AGCAGCAAAC TTAATTTATA CCTTCCGCTC CTCAACTGTC TATTTTTAAT 4920 

CCTGAATTGT TATTTGAGTA ACTCCTTTTT CCTCGTAAAG TTTTCTTCCT CTAAAACTTC 4980 

TGGAAAAAGG CTAATAGTTT CAGACAACAT TTTTATAAGA AACAAGTTCA TCTGTCATTT 5040 

CAAGAAGGAG TAATCCTTTA TCTACTAATG GACGGAACAG AATTCAACCG CTTGTCCGAT 5100 

ATGTTTTCTA AGGATTATAT AGTAAAATGA AATAAGAACA GGACAAATTG ATCAGGACAG 5160 

TCAAATTGAT TTCTAACAAT GTTTTAGAAG TAGATGTATA CTATTCTAGT TTCAATCTGC 5220

TATATCTATT ATGCACACCC CTATAGGATC TAATGAAAAT CACAACAGGC TCATTCATAG 5280 

ATGGTTACCT AAGCCTAAGG GAACTAAGAA AACGACTACC AAGGAAGTCG CATTCATCGA 5340 

AAAGTAGATT AACAACTATC CTAAAAAATG CTTGAACTAC AAGTCCCCCA GAGAAGACTT 5400 

CTGGATGACT AACTTGAACT TGAAATTTAG CAATAATTAA TTCACTATCT AACTATATTT 5460 

AGTAATTATT TCAGAACTGA TTAATATTAA AATTAACTAA CAATTCAAAG GATTCATACT 5520 

AGCCATAAAT TACGTCCATC AGAGAGAGAC TCTTACTACT TTTAGATTTT AGTCTTTCTA 5580 

GCTTCAGAAT ACATCTAAAC TTTAGGGAAA ATGACTATTC GAAAGCGCGA ATGCCTCAAA 5640 

ATTATCTCAG ATAAGCTATT CGAAACTTAG AATGCTTTTA AATTTATGGA ATTGCGATTA 5700 

TTCGAAACCT AGAATGCATA TAACCTTTAG TTGACAGACC TATTCTAAGT CTCGAAGGGC 5760 
TATTTACTTT CTATTCCTTA TCAAAAAAGA CTCATTCCCC CTTTCTCCTC CAAAATATGG 5820

TATAGTAGAA ATATACTATC TATGAGGAGT TTACATGTCA CAGGATAAAC AAATGAAAGC 5880 

TGTTTCTCCC CTTCTGCAGC GAGTTATCAA TATCTCATCG ATTGTCGGTG GGGTTGGGAG 5940 

TTTGATTTTC TGTATTTGGG CTTATCAGGC TGGGATTTTA CAATCCAAGG AAACCCTCTC 6000 

TGCCTTTATC CAGCAGGCAG GCATCTGGGG TCCACCTCTC TTTATCTTTT TACAGATTTT 6060 

ACAGACTGTC GTCCCTATCA TTCCAGGGGC CTTGACCTCG GTGGCTGGGG TCTTTATCTA 6120 

CGGGCACATC ATCGGGACTA TCTACAACTA TATCGGCATC GTGATTGGCT GTGCCATTAT 6180 

CTTTTATCTA GTGCGCCTAT ACGGAGCTGC CTTTGTCCAG TCTGTCGTCA GCAAGCGCAC 6240 

CTACGACAAG TACATCGACT GGCTAGATAA GGGCAATCGT TTTGACCGCT TCTTTATTTT 6300

TATGATGATT TGGCCCATTA GCCCAGCTGA CTTTCTCTGT ATGCTGGCTG CCCTGACCAA 6360 

GATGAGCTTC AAGCGCTACA TGACCATCAT CATTCTGACC AAACCCTTTA CCCTCGTGGT 6420 

TTATACCTAC GGTCTGACCT ATATTATTGA CTTTTTCTGG CAAATGCTTT GACACGTAAA 6480 

AAATCCGTTT GGTTTCCCAA GTGGATTTTT AAAGCGTAGA TTAACTATAG CTTGATACTA 6540 

AATATACTTT GGTATGGAAA TCATGCATAT TTTTCGATAG TGAGGCGAGG ACTTACCTAG 6600 

CCTTTCCGCC GTGATAGAAA CACCTGAAAT CTAATGGTTT CAGGTATTCG GAAACTTTGA 6660 

GCCTAGTGTC TCAAAGTTTA GGTATGGAAT TTTGAAGAAA GTCGCTACCG TCCGTAATCA 6720 

CTTAAGGAAA GGCTCAAAAA TATTGTTTTC AACCACAAAA TCCGTTTGGT TTCCCAAGCG 6780 

GATTTTGTGC TTTATTTTGA AACTTCTTTT GCAAGAACAA AGTTCCCAAG TGTGGCAGAA 6840

CCATTTCCTG CGACTGCTGG CGTCACGATA TAGTCACGCA CATCTGGTAC TGGTAGGTAA 6900

CCATTAAGAA GAGATGTAAA TTTCTCACGG ACACGGTCCA GCATATGTTG TTGAGCCATG 6960

ACCCCTCCAC CAAAGACAAT CACGTCTGGG CGGAAAGTCA CTGTCGCATT AACCGCAGCT 7020 

TGAGCGATAT AGTAGGCTTG AACATCCCAA ACAGGGTTGT TGAGTTCAAT AGTTTCCCCA 7080 

CGTACACCTG TACGAGCTTC CAAACTTGGA CCAGCTGCAT AACCTTCTAG ACATCCCTTA 7140 

TGGAAAGGAC AAACACCCTT AAACTCTTTT TCAATATCCA TTGGGTGTCT AGCAACATAA 7200 

TAATGACCCA TTTCAGGGTG ACCCACACCA CCGATAAACT CACCACGTTG GATGACGCCT 7260 

GCACCGATAC CTGTACCGAT TGTGTAGTAA ACCAAGTTTT CGATACGACC ACCAGCATTG 7320 

TTACGGGCAA CCATTTCACC GTAAGCAGAG CTGTTTACGT CTGTTGTGAA GTACATTGGC 7380 

ACGTTTAGGG CGCGACGAAG GGCACCAAGC AAGTCTACAT TTGCCCAGTT TGGTTTTGGA 7440 

GTCGTCGTGA TAAAGCCATA AGTTTTTGAG TTTTTGTCAA TATCAATCGG CCCAAATGAA 7500 

CCAACTGCAA GACCAGCAAG GTTATGGAAT TTTGAGAAGA ACTCAATGGT TTTATCGATT 7560

GTTTCGATTG GAGTTGTTGT TGGAAATTGT GTTTTTTCTA CAACGTTAAA GTTTTCATCA 7620

CCGACAGCAC AGACAAACTT TGTACCGCTC GCTTCCAAGC TTCCATATAA TTTTGTCATG 7680 

ATAAACCTCT TGTTTTTATT TTCTTTATTA TAGCATACTT CGAAAGTCTA AATGTCTCTA 7740 

TTTTTTAGAT TTTCCTCTGT AAATCTTACT ATCTAATAAA AACGAACAAA CATGTCATTT 7800 

GTTCGTTTTC ACATTAGAGA GGATTGATTA GATTTTCACT TCGATCACAG CATCCCCCTT 7860 

AGCAACTGAA CCTGTTGCGA CTGGAGCTAC TGAAGCGTAG TCACCTGTAT TTGTAACGAT 7920 

AACCATTGTT GTATCATCAA GTCCAGCTGC AGCGATTTTG TTTGAGTCAA ATGTTCCAAG 7980 

AACATCGCCA GCTTTCACCT TATTACCTTG AGCAACTTTT GTTTCAAAAC CGTCACCGTT 8040 

CATAGATACA GTATCAATAC CAACATGAAT CAAAACTTCA GCACCATTTC TTGTTTTCAA 8100 

ACCAAAAGCG TGCCCTGTTG GAAAGGCAAT TGAAACTTCA GCATCAGCTG GTGCATAGAC 8160 

CACGCCTTGG CTTGGTTTCA CAACGATACC TTGTCCCATA GCTCCACTTG AGAAGACTGG 8220 

GTCATTGACA TCAGCAAGAG CGACAACATC ACCGACGATA GGAGTTACAA GTGTTTCATT 8280 

TTGAAGAGCT GCTGGCGCAA CTTCTTCTTT TTCTTCAGCC ACTTCAGCTC GTTTTGCAGC 8340 

TGCAGTTGCG TCTACTTCAT CTTCGTAACC AAACATGTAA GTAAGAGCAA AACCAAGGGC 8400 

AAATGATACA GCTACCATAA GAAGGTATTG TGGAAGTTGT CCGTTACCAA CATAAAGCAT 8460 

TGTACCAGGG ATGATGGTGA TACCATTACC AGTACCAGCA AGTCCAAGGA TAGAAGCCAA 8520 

TCCACCACCG ATTGCACCAG CAATCAATGA AAGGAAGAAT GGTTTACGGA AGCGCAAGTT 8580 

CACCCCGAAG ATAGCAGGCT CTGTAATACC TAGGAAGGCA GAAAGAGCAG CCGGGAAAGC 8640 

AAGTGTTTTC AGTTTTGGAT TTTTTGTTTT AACACCAACC GCAACAGTAG CAGCACCTTG 8700

AGCTGTCATA GCAGCTGTGA TGATAGCGTT GAATGGGTTA GCATGGTCAG CAGCAAGTAA 8760 

TTGCACTTCA AGCAAGTTGA AGATGTGGTG CACACCTGAC ACGACGATCA ATTGGTGAAC 8820

CCCACCAATC AAGAAACCAC CAAGACCAAA TGGCATGCTA AGAATCGCTT TTGTAGCAAT 8880 

AAGGATGTAG TTTTCAACAA CGTGGAAAAC TGGTCCAATG ACAAAGAGTC CAAGGATAGA 8940

CATGACCAAA AGTGTCACGA ATGGTGTTAC CAAGAGGTCA ATGACATCTG GAACAACTTG 9000 

CGGACAGCTT TTTCAAATTT AGCTCCGACA ACCCCGATGA TGAAGGCTGG AAGAACGGAA 9060 

CCTTGCAAAC CAACAACAGG GATGAAACCA AAGAAGTTCA TCGCTGTTAC TTCACCACCT 9120 

TGAGCAACTG CCCAAGCGTT TGGAAGTGAG CCAGAGACAA GCATCATACC AAGAACGATA 9180 

CCAACGGCAG GATTTCCACC AAATACACGG AAGGTTGACC ACACAACCAA ACCTGGCAAG 9240 

ATGATGAAGG CTGTATCTGT CAAGATTTGT GTGTAAGTTG CAAAGTCACC TGGAAGTGGC 9300

ATTTCAAGAG CGTTGAAAAG ACCACGCACA CCCATGAAGA GACCTGTCGC TACGATAACT 9360

GGGATGATTG GAACGAAAAC ATCACCAAAA GTACGGATAG CACGTTGGAA CCAGTTCCCT 9420 

TGTTTAGCAA CTTCTGCTTT CATGTCATCC TTAGATGATG TTGGTAATCC AAGTACAACA 9480 

ACTTCATCGT ACATTTTGTT AACTGTACCT GTACCAAAGA TAATTTGGTA TTGCCCTGAG 9540 

TTAAAGAAAG CACCTTGAAC TTTTTCCAAG TTCTCAATCA CTTCTTTATT GATTTTCTCT 9600 

TCATCTTTGA CCATGACACG TAGACGAGTC GCACAGTGGG CAACACTATT GACATTTTCA 9660 

CGTCCGCCCA AGGCATCGAT GACTTTTTTT GCAATTTCCT GATTGTTCAT TTGCAAAAAT 9720 

CTCCTTATAT AACATTTTGT TCTTGTTTGA AAGCGATTTT ATTCGCCGG 9769 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

CGCTTGAGTG CTAATTCATA GTTCTATTGT ATCACTTGGT CAGAAATAAT CAAGAAAAAA 60 

GTCTGACTTT CTCAAGATAA AAAGCCTGAG ACCAACTCAG ACTTTTTAAT TCTTAAAATG 120 

GCAATTCTTC CTCTTCCAAG ACCAAATCTG CCAAATCTTG GCCTGCATTA TTTTCACGCA 180 

TAGCACGTTG GGCACGACTT TCCAAGAGTT GGAATCCTGT GACAAGTACT TCGGTCACGT 240 

AGTTCATTTG GCCATTTTTC TCAAAGCGAC GGGTACGCAA TTCTCCATCA ACGGAAATGA 300 

GACTACCTTT GGTTGCGTAC TTGCCAAAGT TTCTGCTAGT CTGCCCCATA GGACCATATT 360 

GACAAAATCA GCTTCACGTT CACCGTTTTG GTCTTTGTAA CGACGGTTCA CAGCGATAGT 420 

TGCTCGCGCT ACCGACTTGT CATTGTTGGT TTTGTGCAAT TCTGGTGTAG ACGTTAAACG 480 

TCCAATCAAG ATAACTTTAT TATACATATT TTCTTCCTCC TACTTATCTA TTCGTAGG/A 540 

ATCAAAAAAA GTTACAGAAA TTTGTAACTT TTCGAGAAAA TTTTTTATTT TTTATGAACC 600 

ATGAAACCTG TCGCCTGTTG ATTGGCCATA ATGGTCATAT CTGTAATCTG AACACGACGA 660 

GGTTGACTAG TCACATAGAC TACTGTATCT GCAATATCCT GAGCTTGCAA AGCTTCTATT 720 

CCTTGGTAAA CGGACGCAGC TCGTTCTTTA TCACCATGAA AACGCACTGT AGAAAAATCT 780

GTTTCGACAA TTCCAGGCTG AATGGTCGTC ACCTTGATAT CCGTTGCGAT GGTATCAATT 840 

CGCAGTCCAT CTGAAAAGGT CTTAACTGCC GCCTTGGTGG CTGAGTAAAC AGCTGCACCA 900 

GCATAGGCAT AAATTCCTGC GGTTGACCCC ATATTGATAA TATGACCTTG ATTGGCTTTT 960 
ACCATTGCTG GCAAGAAACA GCGAGTGACT GCCATCAAAC CTTTGACATT GGTATCCAAC 1020

ATGGTCAGCA TATCCAACTC TTCATAGTCT TGATAGGGAG CTAAGCCAAG AGCCAGTCCT 1080

GCGTTATTGA CCAGGATGTC AATCTGACCT ATCGTTTCTA AAATATCAGA GCAGACAGTC 1140

TTTACCATTG TCATATCCGT GACATCTAGG AGAAAAGTCC AAACTGTTTG ATTTGGAAAA 1200

GTTTCTGCAA ACTCCGCCTT AAGAGCTTCT AGTCTGTCTA TCCGTCGTCC TGTTAGAACG 1260

ACATCCTCAC CCTGCTCCAG ATAAGCACGC GCAATCGCTT CACCGATTCC TGATGTCGCT 1320

CCTGTAATCA CAACATTTTT TGCCATCTTA TTTCCTTCTA GCTGGTCTAT CAGATATTAA 1380

CAACTTCTTA GGCAGTCCAG TGTTTCGCTG GGTCGAACGG TGTTCCGACA ACTTGGTCTT 1440

CTGATAATTC AAGCACCCCA CGTTTTTGTG GAGCATTTGG CAGATGCAAT TCACGAGGAC 1500

TGCACATCAT ACCAAAACTC TTTTCACCAC GAAGTTCACC TGGGAAAATG AGATTCCCTT 1560

TTGGCATCAT AGCTCCAGGA AGCGCGACAA TGGTTTTCAA CCCCACACGC GCATTGGGAG 1620

CTCCTGCAAC GATTTGTACA GTCTTATCAC TTGCGACTGC AACTTGGCAG ATGTTGAGGT 1680

GGTCACTATC TGGATGGGCT ACCATCTCAA CAATTTCACC TACAACAAAC TTAGGTTCCT 1740

TATCATTAAC AATTTCTTCT GTAAAACCTT CCGCCTGCAA CTCTTGGTTC AAACGAGCGA 1800

CTTGCTCATC TGTCAAAAAG ACTTGACCGC GCTCTGCAAT TTCAAATAAA CTTGAAACTT 1860

CGAAAATATT CCAAGCCACT GTTTCCCCAT TATCTTTGAG AAAAACACGG GCTACCTTGC 1920

CTTTGCGCTC CACATCCAGT TTGGCATCTC CGCTATTTTT CACGATGACC ATAAGGACAT 1980

CACCGACATG TTCTTTATTA TATGTAAAAA TCATTGTTTC CTTTTTCTCC TATTTCAGTC 2040

CTGCTAAAAA GTCATTGATT TGTTGCTTGC TTTTACGGTC GCGATTGACA AAACGACCGA 2100

TTTCCTTGTC CTTTTCTAGA ACAACAAGGC TAGGAATTCC GTAAACATCC CAGAGTTTGG 2160

CCAAATCCAT ATACTGATCT CGGTCCATTC GAATAAAGGT GAACTCTGGA TTGGTCTCCT 2220

CAATCTCTGG TAAGGCAGGA TAAATATAAC GACAATCGCT ACACCAGTCT GCCACAAAAA 2280

TGAAGACCTT CTTGCCCGCT TTTTCCACTA AAGATGCTAA TTCTTCTAAA CTTGCTGGCT 2340

GTATCATAAG ACTTCCTCCT CATAGACTAG GTCTTCATTT TCATAGACAA AGGTATAATG 2400

ACGGCCATCC TCAAAAATGA CGCCACCAAC CAAGCTCTCC AGACTGCTTT CX3TAAACTTG 2460

AACATAAAGG GTCGCAATTT CCCCCATGTC GGAAAAATGG TCTCGCACAA TCTCTGTCAA 2520

CTCTTCCTGA GTCTTCATGA GCTTACGGTC ATCTGCAACT TTTTTCGTAG CAAGAGCAAG 2580

GCTTCCGATA CCTAGCAGAG CCAAGCCTGC CATCCACATT TTTTTAGCTT TCATACCATT 2640

CATTTTAACA CAAAAAAGGC TTCAGGACAA ATGAGGAAGC AGCAGAAAAG CAAGTAAAAA 2700

GCCTCTTCCT TTAAGGAAAA GGACTTCTTA TACTCAATGA AAATCAAAGA CCAAACTAGG 2760

AAGCTAGCCG CAGGCTGCTC AAAGCACTGC TTTGAGGTTG TAGATAGAAC TGACGAgTCa 2820 

CTCAAAACAC TGTTTTGAGG TTGTGGATGA AGCTGACGTG GTTTGAAGAG ATTTTCGAAG 2880 

AGTATTATTC TTATTGCCAG GCACCTAAGT TGCCAACGTA GTAACTATCA GGTGTGTAGG 2940 

TATTGCGAGC ATCTTACCTG ATGAAGCCAG ATAATACTAC TTGCCATTGT CTTTGACCCA 3000 

ATCATTCGCA ATCATGGAAC CAGAAGAACT TACATAATAC CATTCTCCCT TGTCATAAAC 3060 

CCAAGTACTG ACTTTCATGG TTCCTGAGCA ATTAAAGGCA AAAAAACTGT CCAATAACAT 3120 

TCGTTTTTTA AAAGCATTTG ACACTACAT 3149 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10240 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

CCAAAAATTC AACCTTTAAG GGGAGTCCAG AGAGACTCAC AAGGTGTCAG ATAAAAGAAT 60 

GGTGCAATTT TCTAGAGGAG ACTTTTTGAG TGTGCTCTCT TGTGTTGTAC GATTTTAACT 120 

GAGGCCTTGC ACTAGCAAGG TCTTTTCTTT ATCTGGTCCC CTTAAAATTT AAGGAGGAAA 180 

AGTTATGAAT CCCACATGTA AGAAGCGTTT GGGTGTCATT CGGTTGGAAA CCATGAAGGT 240

GGTTGCACAA GAGGAAATCG CGCCACAATC TTTGAATTAG TCCTAGAAGG AGAAATGGTT 300 

GAAGCCATGC GAGCAGGCCA ATTTCTTCAT CTGCGTGTAC CGGACGATGC CCATCTCTTA 360 

CGTCGTCCTA TTTCAATTTC GTCTATTGAC AAGGCAAACA AGCAGTGTCA CCTCATTTAT 420 

CGGATTGACG GAGCTGGGAC TGCAATTTTT TCAACCTTAA GTCAGGGAGA CACTCTTGAT 480

GTGATGGGGC CTCAGGGAAA TGGTTTTGAC TTGTCTGACC TTGATGAGCA GAATCAGGTT 540 

CTCCTTGTTG GTGGTGGGAT TGGTGTTCCA CCCTTGCTTG AGGTGGCCAA GGAATTGCAT 600 

GAACGTGGAG TGAAAGTAGT GACAGTCCTC GGTTTTGCTA ATAAGGATGC TGTTATTTTG 660 

AAAACGGAAT TGGCTCAGTA TGGTCAGGTC TTTGTAACGA CAGATGATGG TTCTTATGGC 720 

ATCAAGGGAA ATGTTTCCGT TGTTATCAAT GATTTAGACA GTCAGTTTGA TGCTGTTTAC 780 

TCGTGTGGGG CTCCAGGAAT GATGAAGTAT ATCAATCAAA CCTTTGATGA TCACCCAAGA 840 

GCCTATTTAT CPCTGGAATC TCGTATGGCT TGTGGGATGG GAGCTTGCTA TGCCTGTGTT 900 

CTAAAAGTAC CAGAAAACGA GACGGTCAGC CAACGCGTCT GTGAAGATGG TCCTGTTTTC 960

CGCACAGGAA CAGTTGTATT ATAAGGAGAA AATTATGACT ACAAATCGAT TACAAGTTTC 1020

TCTACCTGGT TTGGATTTGA AAAATCCGAT TATTCCAGCA TCAGGCTGTT TTGGCTTTGG 1080 

ACAAGAGTAT GCCAAGTACT ATGATTTAGA CCTTTTAGGT TCTATTATGA TCAAGGCGAC 1140 

AACCCTTGAA CCACGTTTTG GGAATCCAAC TCCAAGAGTG GCAGACACGC CTGCTGGTAT 1200 

GCTCAATGCA ATTGGCTTGC AAAATCCTGG TTTAGAGGTT GTTTTGGCTG AAAAGCTACC 1260 

TTGGCTGGAA AGAGAATATC CAAATCTTCC TATTATTGCC AATGTAGCTG GTTTTTCAAA 1320 

ACAAGAGTAT GCAGCTGTTT CTCATGGGAT TTCCAAGGCA ACTAATGTAA AAGCTATCGA 1380 

GCTCAATATT TCTTGTCCCA ATGTTGACCA CTGTAATCAT GGACTTTTGA TTGGTCAAGA 1440 

TCCAGATTTG GCTTATGATG TGGTGAAAGC AGCTGTGGAA GCCTCAGAAG TGCCAGTTTA 1500

TGTCAAATTA ACCCCGAGTG TGACCGATAT CGTTACTGTC GCAAAAGCTG CAGAAGATGC 1560

GGGAGCAAGT GGCTTGACCA TGATCAATAC TCTGGTTGGA ATGCGCTTTG ACCTCAAAAC 1620 

TAGAAAACCA ATCTTGGCCA ATGGAACAGG TGGAATGTCT GGTCCAGCAG TCTTTCCAGT 1680

AGCCCTCAAA CTCATCCGCC AAGTTGCCCA AACAACAGAC CTGCCTATCA TTGGAATGGG 1740 

AGGAGTGGAT TCGGCTGAAG CTGCCCTAGA AATGTATCTG GCTGGGGCAT CTGCTATCGG 1800 

AGTTGGAACA GCTAACTTTA CCAATCCTTA TGCCTGCCCT GACATCATCG AAAATTTACC 1860 

AAAAGTCATG GATAAATACG GTATTAGCAG TCTGGAAGAA CTCCGTCAGG AAGTAAAAGA 1920 

GTCTCTGAGG TAAACTGCAA TCAATCTGTT CTTGATTTTT TATTAGTTTG TAATATGAAT 1980 

TTAGGAGAAT TTTGGTACAA TAAAATAAAT AAGAACAGAG GAAGAAGGTT AATGAAGAAA 2040 

GTAAGATTTA TTTTTTTAGC TCTGCTATTT TTCTTAGCTA GTCCAGAGGG TGCAATGGCT 2100 

AGTGATGGTA CTTGGCAAGG AAAACAGTAT CTGAAAGAAG ATGGCAGTCA AGCAGCAAAT 2160 

GAGTGGGTTT TTGATACTCA TTATCAATCT TGGTTCTATA TAAAAGCAGA TGCTAACTAT 2220 

GCTGAAAATG AATGGCTAAA GCAAGGTGAC GACTATTTTT ACCTCAAATC TGGTGGCTAT 2280 

ATGGCCAAAT CAGAATGGGT AGAAGACAAG GGAGCCTTTT ATTATCTTGA CCAAGATGGA 2340 

AAGATGAAAA GAAATGCTTG GGTAGGAACT TCCTATGTTG GTGCAACAGG TGCCAAAGTA 2400 

ATAGAAGACT GGGTCTATGA TTCTCAATAC GATGCTTGGT TTTATATCAA AGCAGATGGA 2460 

CAGCACGCAG AGAAAGAATG GCTCCAAATT AAAGGGAAGG ACTATTATTT CAAATCCGGT 2520 

GGTTATCTAC TGACAAGTCA GTGGATTAAT CAAGCTTATG TGAATGCTAG TGGTGCCAAA 2580 

GTACAGCAAG GTTGGCTTTT TGACAAACAA TACCAATCTT GGTTTTACAT CAAAGAAAAT 2640 

GGAAACTATG CTGATAAAGA ATGGATTTTC GAGAATGGTC ACTATTATTA TCTAAAATCC 2700 




GGTGGyTACA TGGCAGCCAA TGAATGGATT TGGGATAAGG AATCTTGGTT TTATCTCAAA 2760 

TyTGATGGGA AAATrGCTGA AAAAGAATGG GTCTACGATT CTCATAGTCA AGCTTGGTAC 2820

TACTTCAAAT CCGGTGGTTA CATGACAGCC AATGAATGGA TTTGGGATAA GGAATCTTGG 2880 

TTTTACCTCA AATCTGATGG GAAAATAGCT GAAAAAGAAT GGGTCTACGA TTCTCATAGT 2940 

CAAGCTTGGT ACTACTTC7VA ATCTGGTGGC TACATGGCGA AAAATGAGAC AGTAGATGGT 3000 

TATCAGCTTG GAAGCGATGG TAAATGGCTT GGAGGAAAAA CTACAAATGA AAATGCTGCT 3060 

TACTATCAAG TAGTGCCTGT TACAGCCAAT GTTTATGATT CAGATGGTGA AAAGCTTTCC 3120 

TATATATCGC AAGGTAGTGT CGTATGGCTA GATAAGGATA GAAAAAGTGA TGACAAGCGC 3180 

TTGGCTATTA CTATTTCTGG TTTGTCAGGC TATATGAAAA CAGAAGATTT ACAAGCGCTA 3240 

GATGCTAGTA AGGACTTTAT CCCTTATTAT GAGAGTGATG GCCACCGTTT TTATCACTAT 3300 

GTGGCTCAGA ATGCTAGTAT CCCAGTAGCT TCTCATCTTT CTGATATGGA AGTAGGCAAG 3360 

AAATATTATT CGGCAGATGG CCTGCATTTT GATGGTTTTA AGCTTGAGAA TCCCTTCCTT 3420 

TTCAAAGATT TAACAGAGGC TACAAACTAC AGTGCTGAAG AATTGGATAA GGTATTTAGT 3480 

TTGCTAAACA TTAACAATAG CCTTTTGGAG AACAAGGGCG CTACTTTTAA GGAAGCCGAA 3540 

GAACATTACC ATATCAATGC TCTTTATCTC CTTGCCCATA GTGCCCTAGA AAGTAACTGG 3600 

GGAAGAAGTA AAATTGCCAA AGATAAGAAT AATTTCTTTG GCATTACAGC CTATGATACG 3660 

ACCCCTTACC TTTCTGCTAA GACATTTGAT GATGTGGATA AGGGAATTTT AGGTGCAACC 3720 

AAGTGGATTA AGGAAAATTA TATCGATAGG GGAAGAACTT TCCTTGGAAA CAAGGCTTCT 3780 

GGTATGAATG TGGAATATGC TTCAGACCXTT TATTGGGGCG AAAAAATTGC TAGTGTGATG 3840 

ATGAAAATCA ATGAGAAGCT AGGTGGCAAA GATTAGTACT ATAAGTGAAT ATGATTTGAG 3900 

TGAATAGTAA GTTAAAAATC CTGATTTCAA GTAAAATCAG GATTTTTTCA TGGATGCAAT 3960

TTTTTTGGAG TCTGGTGTGA CGCGGAGGGT CTTTTGTCCT GTGTAAGTGA CAAAGCCX3GG 4020 

TTTTCCACCA GTTGGTTTAT TGAGTTTTTT GACTTCAATC ATATCTACCT GCACCAGATT 4080 

CGACAGGCGC CCTTGAGAGA AGTAGGCAGC TAACTCTGCT GCGTCTGTCT TGACTGCATC 4140 

AGATGGGTCA AGATTTCCTG AGATGACAAC ATGGCTTCCA GGAATGTCCT TAGCATGGAA 4200

CCAAAGTTCC TCCTTGCGGG CCATTTTAAA GGTCAATTCC TCATTTTGAA GATTGTTTCG 4260

TCCGACATAG ATGATGGTTT TGCCATCGCT TGCTAGATAT TGTTCTAGTT TTTTGCGTTT 4320

CTGGATTTTC TCCCGTTGTC TTCTGCGGAT AAAACCTGTT TGAATCAATT CTTCACGGAT 4380 

TTCAGCGATT TCTTCCAGTC CAGCTTGGTT GAGGACGGTT TCTACACTTT CCAGATAGAG 4440 

AATAGTGGCT TTGGTTTCTT CAATCAAATC AGTCAAGTAT TTGACAGCTT CTTTGAGTTT 4500

CTGATACCGT TTAAAATAGC GTTGGGCATT CTGGTTGGGA GTCAGAGCCT TATCAAGCGC 4560

AATCATGATA GGTTGGTTGG TATAGTAGTT GTCTAGGATA ACCTGGTCTT GGTCGTTAGG 4620 

CACTTGGTGG AGGAAGGTTG TCAGCAATTC TCCTTTTTGA CGAAATTCTT CAGCGTTGTC 4680 

TGTCGCCAGT AACTCTTTTT CCTGTTTTTT GAGTTTGTGT CGGTTTTTCT GAAGTTCATT 4740 

TTCAACACGA CGAATCAGTT CACTGGCCTG CTGTTTGACG CGGTCX3CGCT CAGCCTTATC 4800 

CTTATAGTAG GTGTCCAACA AATCAGAAAG ATTTGCAAAA GGCTCTCCCA CCTGATTTGC 4860 

AAAAGGAACT GGACTGAAGG AAGTCTCAGT CAAGCATGGC TTGGTTTCTT GATTGAAAAA 4920 

ATTTCGGAAA GCGGAAAGTT TTTCACTAAC CAGTATCCTT TCCAATTCAT TTGCCGTATC 4980 

GCGTCCCAGA CCTTGAAAGA GGCTTTGAAG ATTTTTTGCT GTTAGTTCTT GGGTTTGCAG 5040 

GATTTCAAAG AGCTTTTCAT CCTTGATAGT AAAAGGATTG AGAGATTTTG TACTTGGCGG 5100 

AGCGATATAG GTCGATCCTG GAAGTAAGGT GCGGTAGCTA TTTTGTGAAA AGCCGACGTG 5160 

TTTGATAACT TCGAQGATTT TATGACTGCT TTTATCGACC AGTAGAATAT TACTGTGTTT 5220 

CCCCATAATT TCGATAATCA AGGTAGCCTG GATATGGTCT CCAATCTCGT TTTTATTGGA 5280 

AACTGTAATT TCCACAATAC GGTCATTTTC CACTTGCTCA ATCGACTCAA TCAGGGCCCC 5340 

CTGCAAATAC TTTCTCAAAA CCATGATAAA GGTAGAAGGT TGAGCTGGAT TTTCAAAAGT 5400 

CGTTTGGGTC AGCTGAATGC GTCCAAAAAC TGGATGGGCA GAAAGGAGCA GGCGATGGCT 5460 

TTGGCGATTG CTGCGGATTT GCAAGACCAA CTCTTGTTCA AAAGGCTGAT TGATTTTCTG 5520 

GATGCGACCA TTCACTAATT CGCTTCGCAA TTCCTCAACT ATGTGGTGTA AAAAAAATCC 5580 

GTCAAATGAC ATCGTTCTCT CCTTGTGATT GTATTCCATA GTATTATATC AAAAAGGTAG 5640 

AATAAAATCA TGGAAATGTG GTATAATAAA GCCAAGTAAA GAGAAACGAG AAGCACATGT 5700 

ATATTGAAAT GGTAGATGAA ACTGGTCAAG TTTCAAAAGA AATGTTGCAA CAAACCCAAG 5760 

AAATTTTGGA ATTTGCAGCC CAAAAATTAG GAAAAGAAGA CAAGGAGATG GCAGTCACTT 5820 

TTGTGACCAA TGAGCGTAGT CATGAACTTA ATCTGGAGTA CCGTAACACC GACXGTCCGA 5880 

CAGATGTCAT CAGCCTTGAG TATAAACCAG AATTGGAAAT TGCCTTTGAC GAAGAGGATT 5940 

TGCTTGAAAA TTCAGAATTG GCAGAGATGA TGTCTGAGTT TGATGCCTAT ATTGGGGAAT 6000 

TGTTCATCTC TATCGATAAG GCTCATGAGC AGGCCGAAGA ATATGGTCAC AGCTTTGAGC 6060 

GTGAGATGGG CTTCTTGGCA GTACACGGCT TTTTACATAT TAACGGCTAT GATCACTACA 6120 

CTCCGGAAGA AGAAGCGGAG ATGTTCGGTT TACAAGAAGA AATTTTGACA GCCTATGGAC 6180 

TCACAAGACA ATAAACGAAA ATGGAAAAAT CGTGACTTGA TATCCAGTTT AGAATTTGCT 6240

TTGACAGGTA TTTTTACTGC TATCAAGGAA GAACGCAATA TGCGAAAACA CGCAGTGACG 6300

GCTCTAGTGG TCATCCTTGC AGGTTTTGTT TTTCAGGTGT CACGAATCGA ATGGCTCTTT 6360 

CTCCTATTGA GTATTTTCTT GGTAGTAGCC TTTGAGATTA TCAACTCTGC TATTGAAAAT 6420 

GTGGTGGATT TGGCCAGTCA CTATCACTTT TCCATGCTGG CTAAAAATGC CAAGGATATG 6480 

GCGGCCGGCG CGGTATTAGT GGTTTCTCTT TTCGCAGCCT TAACAGGCGC ATTGATTTTT 6540 

CTCCCACGAA TCTGGGATTT ATTATTTTAA ACAGTAAGAG GAAATTATGA CTTTTAAATC 6600 

AGGCTTTGTA GCCATTTTAG GACGTCCCAA TGTTGGGAAG TCAACCTTTT TAAATCACGT 6660 

TATGGGGCAA AAGATTGCCA TCATGAGTGA CAAGGCGCAG ACAACGCGCA ATAAAATCAT 6720 

GGGAATTTAC ACGACTGATA AGGAGCAAAT TGTCTTTATC GACACACCAG GGATTCACAA 6780 

GCCTAAAACA GCTCTCGGAG ATTTCATGOT TGAGTCTGCC TACAGTACCC TTCGCGAAGT 6840 

GGACACTGTT CTTTTCATGG TGCCTGCTGA TGAAGCGCGT GGTAAGGGGG ACGATATGAT 6900 

TATCGAGCGT CTCAAGGCTG CCAAGGTTCC TGTGATTTTG GTGGTGAATA AAATCGATAA 6960

GGTCCATCCA GACCAGCTCT TGTCTCAGAT TGATGACTTC CGTAATCAAA TGGACTTTAA 7020 

GGAAATTGTT CCAATCTCAG CCCTTCAGGG AAATAACGTG TCTCGTCTAG TGGATATTTT 7080 

GAGTGAAAAT CTGGATGAAG GTTTCCAATA TTTCCCGTCT GATCAAATCA CAGACCATCC 7140 

AGAACGTTTC TTGGTTTCAG AAATGGTTCG CGAGAAAGTC TTGCACCTAA CTCGTGAAGA 7200 

GATTCCGCAT TCTGTAGCAG TAGTTGTTGA CTCTATGAAA CGAGACGAAG AGACAGACAA 7260 

GGTTCACATC CGTGCAACCA TCATGGTCGA GCGCGATAGC CAAAAAGGGA TTATCATCGG 7320 

TAAAGGTGGC GCTATGCTTA AGAAAATCGG TAGCATGGCC CGTCGTGATA TCGAACTCAT 7380 

GCTAGGAGAC AAGGTCTTCC TAGAAACCTG GGTCAAGGTC AAGAAAAACT GGCGCGATAA 7440 

AAAGCTAGAT TTGGCTGACT TTGGCTATAA TGAAAGAGAA TACTAAGTAG AGGTAGGCTC 7500

ATGCCTGCTT CTTGTTTTTA CAGAAGGAGG ACTTATGCCT GAATTACCTG AGGTTGAAAC 7560 

CGTTTGTCGT GGCTTAGAAA AATTGATTAT AGGAAAGAAG ATTTCGAGTA TAGAAATTCG 7620

CTACCCCAAG ATGATTAAGA CGGATTTGGA AGAGTTTCAA AGGGAATTGC CTAGTCAGAT 7680 

TATCGAGTCA ATGGGACGTC GTGGAAAATA TTTGCTTTTT TATCTGACAG ACAAGGTCTT 7740 

GATTTCCCAT TTGCGGATGG AGGGCAAGTA TTTTTACTAT CCAGACCAAG GACCTGAACG 7800 

CAAGCATGCC CATGTTTTCT TTCATTTTGA AGATGGTGGC ACGCTTGTTT ATGAGGATGT 7860 

TCGCAAGTTT GGAACCATGG AACTCTTGGT GCCTGACCTT TTAGACGTCT ACTTTATTTC 7920 

TAAAAAATTA GGTCCTGAAC CAAGCGAACA AGACTTTGAT TTACAGGTCT TTCAATCTGC 7980 

CCTTGCX:AAG TCCAAAAAGC CTATCAAATC CCATCTCCTA GACCAGACCT TGGTAGCTGG 8040

ACTTGGCAAT ATCTATGTGG ATGAGGTTCT CTGGCGAGCT CAGGTTCATC CAGCTAGACC 8100

TTCCCAGACT TTGACAGCAG AAGAAGCGAC TGCCATTCAT GACCAGACCA TTGCTGTTTT 8160 

GGGCCAGGCT GTTGAAAAAG GTGGCTCCAC CATTCGGACT TATACCAATG CCTTTGGGGA 8220 

AGATGGAAGC ATGCAGGACT TTCATCAGGT CTATGATAAG ACTGGTCAAG AATGTGTACG 8280 

CTGTGGTACC ATCATTGAGA AAATTCAACT AGGCGGACGT GGAACCCACT TTTGTCCAAA 8340 

CTGTCAAAGG AGGGACTGAT GGGAAAAATC ATCGGAATCA CTGGGGGAAT TGCCTCTGGT 8400 

AAGTCAACTG TGACAAATTT TCTAAGACAG CAAGGCTTTC AAGTAGTGGA TGCCGACGCA 8460

GTCGTCCACC AACTACAGAA ACCTGGTGGT CGTCTGTTTG AGGCTCTAGT ACAGCACTTT 8520 

GGGCAAGAAA TCATTCTTGA AAACGGAGAA CTCAATCGCC CTCTCCTAGC TAGl'CTCATC 8580 

TTTTCAAATC CTGATGAACG AGAATGGTCT AAGCAAATTC AAGGGGAGAT TATCCGTGAG 8640 

GAACTGGCTA CTTTGAGAGA ACAGTTGGCT CAGACAGAAG AGATTTTCTT CATGGATATT 8700 

CCCCTACTTT TTGAGCAGGA CTACAGCGAT TGGTTTGCTG AGACTTGGTT GGTCTATGTG 8760 

GACCGAGATG CCCAAGTGGA ACGCTTAATG AAAAGGGACC AGTTGTCCAA AGATGAAGCT 8820 

GAGTCTCGTC TGGCAGCCCA GTGGCCTTTA GAAAAAAAGA AAGATTTGGC CAGCCAGGTT 8880 

CTTGATAATA ATGGCAATCA GAACCAGCTT CTTAATCAAG TGCATATCCT TCTTGAGGGA 8940 

GGTAGGCAAG ATGACAGAGA TTAACTGGAA GGATAATCTG CGCATTGCCT GGTTTGGTAA 9000 

TTTTCTGACA GGAGCCAGTA TTTCTTTGGT TGTACCTTTT ATGCCCATCT TCGTGGAAAA 9060 

TCTAGGTGTA GGGAGTCAGC AAGTCGCTTT TTATGCAGGC TTAGCAATTT CTGTCTCTGC 9120 

TATTTCCGCG GCGCTCTTTT CTCCTATTTG GGGTATTCTT GCTGACAAAT ACGGCCGAAA 9180 

ACCCATGATG ATTCGGGCAG GTCTTGCTAT GACTATCACT ATGGGAGGCT TGGCCTTTGT 9240 

CCCAAATATC TATTGGTTAA TCTTTCTTCG TTTACTAAAC GGTGTATTTG CAGGTTTTGT 9300 

TCCTAATGCA ACGGCACTGA TAGCCAGTCA GGTTCCAAAG GAGAAATCAG GCTCTGCCTT 9360 

AGGTACTTTG TCTACAGGCG TAGTTGCAGG TACTCTAACT GGTCCCTTTA TTGGTGGCTT 9420

TATCGCAGAA TTATTTGGCA TTCGTACAGT' TTTCTTACTG GTTGGTAGTT TTCTATTTTT 9480 

AGCTGCTATT TTGACTATTT GCTTTATCAA GGAAGATTTT CAACCAGTAG CCAAGGAAAA 9540 

GGCTATTCCA ACAAAGGAAT TATTTACCTC GGTTAAATAT CCCTATCTTT TGCTCAATCT 9600 

CTTTTTAACC AGTTTTGTCA TCCAATTTTC AGCTCAATCG ATTGGCCCTA TTTTQGCTCT 9660 

TTATGTACGC GACTTAGGGC AGACAGAGAA TCTTCTTTTT GTCTCTGGTT TGATTGTGTC 9720 

CAGTATGGGC TTTTCCAGCA TGATGAGTGC AGGAGTCATG GGCAAGCTAG GTGACAAGGT 9780

GGGCAATCAT CGTCTCTTGG TTGTCGCCCA GTTTTATTCA GTCATCATCT ATCTCCTCTG 9840

TGCCAATGCC TCTAGCCCCC TTCAACTAGG ACTCTATCGT TTCCTCTTTG GATTGGGAAC 9900 

CGGTGCCTTG ATTCCCGGGG TTAATGCCCT ACTCAGCAAA ATGACTCCCA AAGCCGGCAT 9960 

TTCGAGGGTC TTTGCCTTCA ATCAGGTATT CTTTTATCTG GGAGGTGTTG TTGGTCCCAT 10020 

GGCAGGTTCT GCAGTAGCAG GTCAATTTGG CTACCATGCT GTCTTTTATG CGACAAGCCT 10080 

TTGTGTTGCC TTTAGTTGTC TCTTTAACCT GATTCAATTT CGAACATTAT TAAAAGTAAA 10140 

GGAAATCTAG TGCGAGTAAA AATCAATCTC AAATGCTCCT CTTGTGGCAG TATCAATTAC 10200 

CTAACCAGTA AAAATTCAAA AACCCATCCA GACAgATTGA 10240 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13206 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

CGCTTTATCG TGGACGTGGT CAAGCCGAGA ATTTCATCAA GGAGATGAAG GAGGGATTTT 60 

TTGGCGATAA AACGGATAGT TCAACCTTAA TCAAAAACGA AGTTCGTATG ATGATGAGCT 120 

GTATCGCCTA CAATCTCTAT CTTTTTCTCA AACATCTAGC TGGAGGTGAC TTCCAAACTT 180 

TAACAATCAA ACGCTTCCGC CATCTTTTTC TTCACGTG&T GGGAAAATGT GTTCGAACAG 240 

GACGCAAGCA GCTCCTCAAA TTGTCTAGTC TCTATGCCTA TTCCGAATTG TTTTCAGCAC 300 

TTTATTCTAG GATTAGAAAA GTCAACCTGA ATCTTCCTGT TCCTTATGAA CCACCTAGAA 360 

GAAAAGCGTC GTTAATGATG CATTAAAGAA CAGTCGAGAT GAAAAAATCG TGTGACGCAC 420 

CAAGGGAGGA GTCTGCCCTT TTGAGGAAAT CTAGCGAGGA AAAACGATAC TGGAACAGCA 480 

GAAAGTAAAA CTGACCTCAT GAGGAGGAAG AAAGTGGCTC ATGAGGTCAG GGGTTTTG'^A 540 

AGTTACATCT AGTTGAGAGA GGTATGAATG ATTTGGGATT AATCATTTCT TGTTTTAAAT 600 

CAGGAGAATA GTAACGATTT TTTCCTTTTT TGACGAACTC TATTCCGTAA CGATCAATCA 660 

ATTTAATCAT GTACCTAATA TTAGAATTGT TTATCCCAAA TTTATTTGAA AGCTTCTCTA 720 

AGCTATATCC TTGTTTTCTA AGTTCATAGA TCTGAACTTT ATCATCATAA GTTAGTTTCA 780 

TAATAAAAAC ACCCCAAAAG TTAGATTTTT TCTGTCTAAC TTTTGGGGGG CAGTTCATTC 840 

AACACCTGAT ACTATGCGTT TTTCTTATTT GAAATACTTT TTACTCAACC TCTTTATACT 900 

CAATGAAAAT CAAAGTGCAA ACTAGAAAGC TAGCCTCAGG CTGCTCAAAA CAGTGTTTTG 960

AGGTTGCAGA TGGAAGCTGA CGTGGTTTGA AGAGATTTTC GAAGAGTATT ACTTAATCTT 1020

CTTGATACTT TGACTAAGAA TAAATCCTAC AATCATCCCT ACCATATTTT GCATAAAATT 1080

CGGTAGAATT TCTGGGAGGG CTGCTGCCCA GCCATTCATC AAAGCAGAAC CCAAGGCGTA 1140

GCCTCCTACC ATGGCAATAG TTGCTAAAAT AAGGCCTAAC CACTGACTTT TTCXTTTTAAA 1200

TCCTGCGAAA AATCCCTGCA AGCCATGGTT GACCAAGCTA AAGAACATCC ACTGAGGGTA 1260

GCCTGATAAG AGGTCAATCA AGAAACTTGC TAGTCCTCCG ACTACCGCTC CTTCACGACT 1320

ACCAAAGTAA AAGGCCGCAA AGAAGACACC AGCATCTAAA AGAGTTAGAA TTCCTGTAGG 1380

TGTTGGGATT TTTAAGAAAT AACCTAGAAC CACAGAAAGG GCGGTTAATA GGGATACAAG 1440

GGCGATTTTA GTTGTTTTTG TTTGCTTCAT ATTGTCTTAC TCCATACTGA TCTGCTTGTG 1500

CAATAGCACG ATAAACGAAA GCCTTAGAGC TTTCTACTGC TGGCAAAAGT TTATCACCTT 1560

TAACCAGGTG ACTGGCAATG CTAGAGsCAA AGGTACAACs TGCACCAGCA TTTTGGCCTT 1620

GGATAACTGG ATTTTCTAGG ATAGTAAAGG TCTGTCCATC ATAAAAGACA TCCACAGCCT 1680

TGTCCTGACT AAGACGATTG CCTCCCTTGA TAATGACTGt GGCGCTCCTA AATCATGCAA 1740

TTTCTGCGCT GCAGTTTTCA TGTCTTCCAA GGTTTTAATT TCCTGACCGG AT7U\TAATTC 1800

TGCTTCTGGG AGATTAGGCG TAATCACACT GACATAAGGG AAAAAGCGAA TCAACTCTTG 1860

GCAGAGCTCA CTGACAGCTA CATCATGCGT TTCCTTGCAG ACCAAGACAG GATCCAACAC 1920

CACAGGTACT CCTGGGCGTT GTTTGATAAA GTCCAAGGCC TTCTCAGCCA CX3CTGACAGT 1980

AGGGAGAAGA CCAATCTTAA TTCCCCCAAA TTCCACATCA CGCAAGCTAT CTAATTCATG 2040

TTGAAAAATG GTATCATCAG TTGGAAAGAC TTCAAATCCT TTTTCTGTCA AGGCTGTCAA 2100

ACAAGTCACT GCTACAAACC CATGCAAGCC GTTCAAGGTA TAQGTAGCCA AATCAGCTGA 2160

CAGTCCACCA CCACTAAAAA TATCATTTCC AGAAAGTGCT AAAATACGAT TATTCTTCAT 2220

AACGAATCTC CTTTAAATAC AAACCATTTG GTGCTGCAGT GGGACCTGCA AGTTGCCTGT 2280

CCTTCTTCTC CAAGATGAGA TCAATCTGCT CTACTGGCAT GCGGTTGTTA CCGATTTTGA 2340

GAAGAGTCCC CACCATATTG CGAATCTGTT TATACAAGAA ACCATTTCCT GAAAAGGTAA 2400

AGGTCAAAAA TTGTCCTGTC TCATCGACTA TTAAACTAGC TTCTGTGATG GTGCGAACCT 2460

TATCCTCTAC ACTAGTCCCA GAGGCTGTAA AACCGGTAAA ATCATGGGTT CCCTCTAGCT 2520

TTTTGATTGC AATCTGCATT CGTTCCACAT CGAGTGGGTA GGGAAAGTGG GTGGCATAGT 2580

GACGGCGCAT CGGATTTTTG GGACGTCCTC TATCCACAGT AAACTCATAG GTCTTGCTAT 2640

GCTTGGCATA ACGGCAATGA AAATCATCTG CCACAAGCTC AATCGAAATC ACATCAATAT 2700

CTTCAGGAGA CTGGGTATCC AAGGCAAAAC GGAGTTTCTC CTCATCCATC TGATAAC3GCA 2760

GGTCAAAATG AATCACCTGT CCCAGGGCAT GAACCCCACT ATCTGTCCTA CCAGCACCGT 2820 

GAACAGTAAT GGCTTGCCCT TTATTTAATC TGGTCAAGGT TTTTTCAATT TCTTCCTGAA 2880

CGCTACGCGC ATGAGGCTGG CGCTGAAAGC CAGCAAAGGC ATAACCATCA TAGGAAATAG 2940 

TTGCTTTATA TCTCGTCATA GCCTCTATTT TATCAAGAAA TTAGTCTGTA AACAAGGACC 3000 

TAAAACAAAT ATTGTATGGG TATAAAAATC TCATACTCTT CGAAAATCTC TTCAAACCAC 3060 

GTCAGTTTCC ATCTGCAACC TCAACACACT ATTTTGAGCA ACCTGCGGCT AGCTTTCTAT 3120 

AGTAGATTGA AATAAGATAT GAACAACTCT ATTAGGAAAG TCAAATTAAT TTCTAGAAAT 3180

ATTTTAGCAG CTACAGCGTA CTATTCCAAA CTCAATCAAC TATAGTTTGC TCTTTGATTT 3240 

TCATTGAGTA TCAAAAGAAA AACTTAGGAA TCAATCCTAA GCTCTCTTCT GAAGTAGGTA 3300

CATGACAAAG ATAGAGATTA CAATCAACCA ACCTCCTAAG ATACTAAAGA CCAACATCCC 3360 

ATTGTGAGTT AGTAAGCCAA TTGCACCTAG AACGAATGGG GTCGTAAAGG CTCCGAAACT 3420 

ACAGCCTAAT ACAGCAAATG AAGTTGCTTG ATTGAGGAGT TTAGCTGGAA TTCGTTCAGA 3480 

GACAAGTTGA AAGACCGTCG TCAAGACTAC ACTATAGGCA AATCCAGCCA GAACACTTCC 3540 

TGCTACTACC ACCCACAAGG ATGAAGACAA GGCAATCACG ATTTGCCCCA AGCCAAAGGT 3600 

AATACCAGAC CAGAGGAGCA GTTTCTCTTT AAAGATAGAA ATCAAGAAAG AAAAACTCAC 3660 

CCCAGCCACA ATCCCGATCA ACTGCATGAT ACTAAGAACA AAACTAGATA ACTGGGCATC 3720 

CCCCAATCCT CTTTCCACCA TCAAACTTGG AATACGGATG GTAATAGCTG TATTGGTACA 3780 

AACTACAACT GCCGCTTCGA TAGCTAAGGT AAAAATCAAG CCTTTCATTT CTCGAGTTAA 3840 

ACGACTTGCT TCCTTCGCTC TTTTCTTGAC TTCTTTCTTT GATTTTCCAT AAGGGACAAA 3900 

GAGCAGATAA AGGGGCAGCA CCAAAAATCC AGCACTATAG GCTAGAAAGA TAGCTGTCCA 3960

ACCAAAGGCC AACAACTGAC CGACGGCCAA GGTAATGAGA GAAGCTCCAA CGACCTCTGC 4020 

AGAAGCGCGT AGCCCTAACA TCTGAATTCG CCTTTTTCCT TGGTAGCGTT CACTGATAAT 4080 

AGAAATGGCC TTGGCATTGA TCATCCCAAG ACCCAAACCA AAGAGAAGCC GTGTTCCAAA 4140 

GACAAAGGGA TAGGCTTGGT ACCAGAAGGG AGCTGTACCG CTCAATGATA AAATCAGCAA 4200 

GCCCAAACTA ATCTGTAAGC GCTCAGGAAA TATTTTTTCT AAGAAACCAT TTAGCAGTAA 4260 

CATCATCATG ATTCCAAAGG AAGGCAAGCT CACCAAGAGC TCAATTTGTT CCTTAGAATA 4320 

ACCCTGATAA TAGTCAAACA TGGCTGGTAG GGCACTCGAA ATGGAAAAGG AGGTAATCAA 4380 

AACGAGGGAG AGAGCCAAAA TGCTGGCCCG TTCTAAAAAT TGTTTCATGA AATCTCTTTC 4440 

TATATTTCTC TTAATCTTCT ACrTTTTTGA TAOTTATCAA ATAAGCAAGA AAAGAAGAAG 4500

CCTCATTGGT TTGTAGACTC CTTCTTAAAT TCGAAAATGA ATCCCTTGTA TCTTATACTC 4560

AATGAAAATC AAAGAGCAAA CTAGGAAGCT AGCCGCAGGT TGTTCAAAAC AGTGTTTTGA 4620 

GGTTGCAGAT GGAAACTGAC GTGGTTTGAA GAGATTTTCG AAGAGTATTA GGATGACTTT 4680 

CTCTTGATTT GCTTGATAAA GTAGAAAATA AATCCTGCTA CCATATAGGC AACAAAGATA 4740 

ATCAGACACC ACTTAAACAC AACATTCCAA CCCTTGTTCA CATTCAAAAA GAAGTAAGGG 4800 

AAAGGATTAT CCTTGGCATT TGGAATATTG AGTTTTAGAA CCAAGCCATT AAAAAGAGCA 4860 

AACATCATAT ACAGAAAGGG TAAAATGGTC CACACTGCTG GATCCCAAAT CTTGTATTGA 4920

CCCTGTTTGT CAAAAAAGAG GGTATCCGCT AAAAACCAGA TGGGAACGAT ATAGTGGCAA 4980 

AGGAAATTTT CTAGGGTATA GAAATTAGTC GCAATGGGCG CCAAGAGGAA ATGGTAAATC 5040 

ACACAGGTAA TCATGATACT CATGGTGACC CCACCTTTTA AGCX3CAAGAG ACTTGGCCTT 5100 

TGCCAATTTT CACCTACACG GCTCATAACC TTTAGAAGAT AAAGGGTAAA AATAGTTACC 5160 

AAGAGGTTOG ACAGAACCGT GTAATAGAGA AGCATCCCAA AACXTACCATG CTTAGTAATT 5220 

TCAAGATAAA CTCCCGTAAA AGCCGCTAGA AACAAGAAGA TACGGCTATA AAATACAAGT 5280

TTATAGTGTT TTGACATGCT TAAATCTTCC TCACAAACTC TGATTTAAGT TTCATGGCAC 5340 

CAAAACCATC AATCTTACAG TCGATATTGT GGTCGCCTTC TACGATGCGG ATATTTTTCA 5400

CGCGCGTCCC TTGTTTCAAA TCTTTTGGCG CACCTTTTAC TTTCAAGTCC TTGATGAGAG 5460 

TTACTGTATC ACCATCAGCC AATTTATTTC CGTTGGCATC GATAGCGACA AGACCTTCTT 5520 

CTACTTCTGC AACTTCAGCA GGATTCCACT CATGAGCACA CTCTGGGCAA ACCAGTAGGG 5580 

CACCGTCTTC GTAGACATAC TCTGAGTTAC ATTTTGGACA ATTTGGTAAA TTGTTCATGG 5640 

TTTCTCCTTA TCATCATTCA CTATTCTTTG AAAATCAAAA TTTCTCGAAC AGCAACTATT 5700 

ATACCCTAAA ATCAGCATTT TGACAAATTT AGAAAAAAAC CGATATCAAT CTATCGGCTT 5760 

TTCTACATTT ACATTCTTTT TTCAGCTTCT GCTTTGATTT TTTCAACTAC TTCTTGAATG 5820 

TTCAAACCAG TTGTATCAAG GTAGACAGCA TCCTCTGCTT GTTTGAGAGG AGAAGTCTCA 5880 

CGATGACTAT CCTTGTAGTC ACGCGCAGCA ATTTCCTTTT TTAGGGTTTC AAGGTCTGTT 5940 

TCAATTCCCT TGGCAATATT TTCCTTGTAA CGACGCTCTG CTCTCTCATC AACAGAAGCT 6000

ACTAGGAAAA TTTTCAATTC TGCTTGTGGC AATACAACAG TTCCAATATC GCGACCATCC 6060 

ATGACAATCC CGCCTTGCTG GGCAATTTCT TGTTGGAGAG AAACCAGTTT CTCACX3CACT 6120 

TGAGGAATTG CTGCAATAGC AGAAACATGA TTGGTCACTT CATTTTCACG GATAGGATGG 6180 

GTAATATCCA CATCTCCTAC AAAAACAAGC TGGTCTCCAG TTTCTGAACG TCCAAAGCTG 6240

ATTGGATGCT GGTCCAACAA GGCTAGAAGG GCTTCGACTT CTTCAACTCC TAATTGGTTC 6300

TTAAGAGCCA TATAGGTCGC TGCACGATAC ATAGCTCCTG TATCAAGGTA GGTGAATCCA 6360

AAATCCTTAG CAATAATCTT TGCGACCGTA CTCTTACCGC TGGAAGCAGG ACCATCAATA 6420

GCAATTTGAA TTGTTTTCAT ATCGGCTCCT ATTTTATTTT TATAACATCA CCTGGATTAG 6480

CAAACCAAGA TCCTGTAGCC ATGTGCCCAG GATTCAAGGC CTCTAACTGA GCAATGGAGA 6540

TTCCTGCACG AGCGGCAATA GCTGCTTCCC CTTCTCCTGC GAGAACTTTA ATCGTTCCTT 6600

CAGGATTAGC AGCTTCTTCT GAACTACTAG AAGTAGATTC TGGCTCTGAA CTCTGCTCAG 6660

GCTGAGAACT ACTTGAAGAT GAGATTTGTA CTACACTGGC ATCAGAATCA TGAAAGCCTT 6720

TTAAGGCTGC TGTGCGATTA CTCCCCCCCG ATGATAGATA GATGAGAACG ATGACCATCA 6780

CCACCACAAT TACAAAGAAA ATACTAGCTA GGATCGTCAA AATACGATTA GCCATCCTAT 6840

CAGCCCCTCC GTGGTTTCGA TGCCGACGCT CTGCTCTTGA TTCTTCTTGA TCATAGATAT 6900

CTTCTTGCCA CGGTTCTTTT GCCATACCTT ACTCCTTGTT TTTTTTTACT TTTCTTATTA 6960

CAATATAAAT ATGAACATGA AAATCACACT TATACCTGAA CGATGTATCG CCTGTGGGCT 7020

TTGCCAAACT TATTCTGATT TATTTGATTA CCACGATAAT GGAATCGTGC GTTTTTACGA 7080

TGACCCTGAC CAACTGGAAA AAGAAATTTC TCCTAGTCAG GATATCTTAG AGGCTGTTAA 7140

AAATTGCCCA ACTCGCGCCC TGATTGGAAA CCAGGAAGCC TAAATCAATG GCGATAATCC 7200

ACTCCCTCTA GTTTAGCACA TTTCCATGTA AAATTATAGT CTTTTCACTT TATTTTTTTC 7260

TGTAAAATCA GGAAGGTCAC            GATAAGATAA AGTGGTCTTT TTTTAGTCTC 7320

TAAATAAATC TTACTGATAT ACTTGCCGAG AATCCCAATG GTCAAGAGTT GAATCCCTCC 7380

AAGAAAGAGA ATAACAGCCA TCAGAGAGGT CCAACCAGAT GTCGGATTGC CCAAAATGAG 7440

GGTCCGAACC ACAACAAAAA AGGTCATCAG CAGAGAAAGA AAACAAGATA GGAGACCAGC 7500

TACAAAGGCT ATAATCAAGG GAAAATCTGA AAAATTAATA ATCCCTTCAA TGGAGTAGAA 7560

AAAGAGTTGC CTAAAACTCC AACTTGTCTT GCCAGCCTGC CTTTCGACAT TTGGATAGTC 7620

CAAATAGTAG GTTTTGAAAC CCACCCAGGC GAAGAGCCCC TTTGAAAAAC GATTGGACTC 7680

GGTCAAGCTT AAAATGGCAT CGACTACAGA CCTTCTCATC ATACGAAAAT CACGGACACC 7740

CGACGGCAGA GCTACTGGGC TGATTTTTTG CATGAGGCGA TAAAAGAGAA CAGCACAGAA 7800

ACTGCGAAAG AAGGGTTCTC CCTCCCGACT AGTTCTCCGT GTCCCAACGC AGTCCAAGTC 7860

TACATTTTTG TCTAATACAT TTTTCATCTC AAACAACATA CTAGGAGGAT CTTGGAGGTC 7920

TGCATCCATC ACCACCACCA AATCTCCTGT CGCATATTGC AAGCCTGCAT AAAGGGCTGC 7980

TTCTTTGCCA AAATTTCGAG AGAAAGAAAT ATAATGGACT GCCGGATTTT GCTCCCGATA 8040






GGCCTTTAAG AGTTCCAAGG TCCCATCACT TGATCCATCA TCGACAAAGA CATACTCGAT 8100

TTCTGTTTCC AAATCTGGAA GTAAAGCTTC CAGAGCCTGA TAAAAAAGAG GAAGTACTTC 8160

CTCTTCGTTT AAACAAGGGA CGATGATTGA AATCATCATC TTAGTCTTCA AATCCATTTG 8220

GATGCTTGCP TTGCCAACGC CATGCGTCTT CACACATTTG GGTGATGTCG AGTTCTGCTT 8280

CCCAACCGAG TTCTGCTTTA GCTTTTGCCG GGTCTGAGTA GCAGGCAGCG ATATCACCTG 8340

GGCGACGTTC TACGATGCGG TAAGGAATAG GACXSGCCCAC CGCTTTTTCC ATGTTTTGGA 8400

TAATTTCAAG AACTGAGTAA CCTTTACCAG TTCCAAGGTT ATAAACGTTT AGTCCTGAAC 8460

CTTTTTGGAT TTTTTTCAAA GCTGCAACGT GACCCTTAGC CAAATCGACA ACGTGGATAT 8520

AGTCACGAAC ACCTGTTCCA TCTTCCGTAT CGTAATCGTC TCCAAACACT TGCACTTGCT 8580

CTAATTTTCC AACGGCTACT TGAGTCACAT ATGGCAAGAG ATTGTTTGGA ATACCGTTTG 8640

GATTTTCTCC CAAATCACCA CTCTCATGGG CTCCGATTGG GTTAAAGTAA CGAAGCAAGA 8700

CAACATTCCA TTCTGAGTCT GCTTTGTAAA TATCAGTCAA AATTTCCTCT AGCATGAGCT 8760

TAGTACGACC GTATGGGTTG GTCACTGAAA GTGGGAAATC TTCCAAGATG GGCACTGTGT 8820

GCGGATCCCC GTAAACTGTC GCAGAAGAAC TGAAGATGAT GTTTTTACAG TTGTTTTCTT 8880

CCATGGCTTT CAAAAGGCTG ACAGTTCCAG CGATATTGTT GTCATAGTAG GCAAGAGGGA 8940

TACGTGTTGA TTCGCCAACA GCCTTCAAAC CAGCAAAGTG AATGACACCA GTCGGTTCTT 9000

CCTGCTTGAA AATATCTCTG AGGGTATCTG TGTCACGAAT ATCTGCCTCA TAGAAAGGAA 9060

TCTCAACTCC TGTGATTCCT TCAACAACTT CTAAACTCTT ACGATTGCTA TTGACAAGAT 9120

TATCCACCAC AACAACTTGA TGACCTGCTT GGATCAATTC AATAACAGTG TGGGTTCCAA 9180

TAAAACCGGC ACCACCAGTT ACCAAAATCT TTTCTTGCAT            CGATTCTCAG 9240

ATTATTTTTT CTPTATTTTAC CATTTTTGAC AGGGAATGTC ATTTGCCATC CTAAACTACC 9300

TGATAAAATT TCAGTAAAAT GCTTATACTC TTCGAAAATC CAATTCAAAC TACGTCAACG 9360

TCGCCTTGCC ATGGGTATGG TTACTGACTT CGTCAGTTCT ATCCACAACC TCAAAACAGT 9420

GTTTTGAGCT GACTTCGTCA GTTCTATCCA CAACCTCAAA GCAGTGCTTT GAGTAACCCG 9480

CGGCTAGTTT CCTAGTTTGT TCTTTGATTT TTATTGAGTA TTATTCGCTT TTTACTCGTT 9540

TGACATAGTT TTCAATTGGG TAATTTAGAG GGTCCAAGGT CAACTCCTTG TCTTGGATCA 9600

GTTGGGCTAG ATGGTAACCA ATGATAGGAC CAGTTGTGAG GCCTGATGAA CCTAGTCCAC 9660

TGGCTGCATA GACACCAGTT AAGTCAGGCA CCTGCCCAAA GAAAGGAGAG AAATCACTGG 9720

TGTAGGCACG GATTCCAACA CGCTCAGATT TTGAAGTAGC TTCAGCCAAA ATCAGATAGT 9780

GAGTCAAGGT GGCCTCCTCC ATTTGTTGGA GCAAGGTTrC ATCTACCGTC AAATCAAATC 9840

CCATGTCATT TTCGTGGGTA GCGCCTAAGG ATAATTTCCC ACCTGCAAAG GGAATCAAAT 9900 

CCCACTCCCC TTCTGGCATG ACAACAGGGT AATCTTCCAT GTCTTGGGCA AGCTGATAAT 9960 

CTCGTAGTTG TCCTTTTTGA GGACGGACAT CCACTTCATA ACCTAAAGGC TCTAACATGT 10020 

CCCCCAACCA AGCTCCCGTC GCCAAAATAA CCTGCTCAAA CTCCTCTTCA CCAATCTGGT 10080 

AGCCTGATGC TAACGGTGTC AGAGTCACTT TTTCTTTGAC CAGCTTGACA TGACTGACTT 10140 

CCAGCAAACG AGTCACTAAA AGTTGGCCAT CTACTCTCGC TCCACCAGAA GCATAGAGCA 10200 

GGCGGTCAAA TCCCTGCAAA CCAGGGAATA ATTCATTAGC TGAGGCTTGG TTCAGAATGG 10260 

CTAATTGCCC TATCAAGGGA GATTCTTCTC TGCGCTGGAG GGCCAGTTGA TAAAGTTCTT 10320 

CCAAATTGGA TTCATCCTTT TTCAAGAGAA AGACTCCCGA ACGCTGGTAA AAGTCGATTT 10380 

CTTGTCCTGA TTTCTCTAAA TCAGCTAATA AATCCACATA AAAATCAGCC CCCAAGCGCG 10440 

CCATCTTGTA CCAGGCTTTA TTACGGCGTT TGGAAAACCA AGGACTGATA ATTCCTGCTG 10500 

CGGCCTTGGT GGCTTGACCT TGCTCATGGT CAAAAACGGT CACCTCTAGG TCACTTTCTC 10560 

TCGAGAGGTA GTAGGCAGCT GTTGCTCCCA CAATTCCTGC TCCAATAATG GCAACTTTTT 10620 

TCATTGTCTT CACTTTCTAA CTAGATATGA TGGAAAGGAT TGGTTGATGC CTGACTAGGC 10680 

AAGATATCAA TAGACCACCC CTTATCTTCC TTCCATTGAC TAAGAAGTGC TGCGATTTTT 10740 

TCTACAAAAA TCACTTCGAT ATAGTGACCT GGGTCCAATG CAAGCAACCC ATCAGATAGC 10800 

ATATCCTGAG CAGTATGGTA GTAGATATCA CCAGTGATAT AGACATCTGC CCCCTTTGCC 10860 

AAAGCATCCT TATAGAAAGA CTGCCCGCTT CCACCACAAA TTGCTACTCT TGAAATAGGC 10920

TTCTGCAAAT CATCCTCTTG ATAATGCACC ATTCGAAGGC TATCTAGGTC AAAGACTTGC 10980 

TTGACCTGTT GGGCCAATTC CCAAAATGTC TGAGGCTGAA TATTCCCAAT ACGTCCAATT 11040 

CCACGTTCTG GACCTGTTTC CTGCAGATAA GTCGTCTCCT CGATTCCTAG CATCTGACAA 11100 

AACCAGTCAT TGAGCCCATT TTCAACGATA TCAATATTGG TATGGCTGAC ATAAACTGCG 11160 

ATATCATGCT TAATCAGGTC GATGTAAATC TGATTTTGCG GACGGCTGGC AAGCAAGTCC 11220 

TTGATAGGAC GAAAGATAGG CGCGTGCTTG ACGATAATCA AGTCCACACC CTTTTCAATG 11280

GCCTCTGCCA CTGTCTCTTC ACGAATATCG AGGGCAACCA TGACCCTTTG GATACCCTTG 11340 

TCTAAAGTGC CAATTTGCAG ACXIACGGCTG TCTCCCTCCA TAGAAAATTC CTGAGGGCAA 11400 

AAGGCTTCAT AAGCTTGGAT CACTTCACTT GCTAACATGG AGCACCTCCT TGATAGCTTG 11460 

AATCTTATCT ACTAGAACTT GACGTTCTTC CAGATTTTTT TCTGGGATTT GTCCGAGGGC 11520 

GAACTCTAGC TTCTCAGCTT CTTTTTGCCA TTTTTGGACA AATACTGGAC TGACTTCTTT 11580

GGACAAGAAG GGACCAAAGC GAACATCACT GGCTGATAGC TTCATTTGTC CTGCTTCCAC 11640

CACCAAAATC TCATAAAACT TTCCAGCTTC TTCTAAGATG CTTTCTGCTA CAATCTGGAA 11700 

TCCATGATCC TGTAGCCAGA TACGCAAGTC GTCTTCACGA TTATTGGGCT GGAGGATCAA 11760 

ACGCTCTACA TTAGCTAACT TCCCCAAACC TTCTTCTAAA ATCCTAGCAA TCAAACX5ACC 11820 

ACCCATGCCA GCAATGGTAA TGACAGACAC TTGGTCAGTC TCTTCAAAAG CTGCCAAGCC 11880 

ATTGGCTAAA CGGACTTGGA •rTTTCTCCTT TAGGCCGTGA GCCTCAACAT TTTTAACCGC 11940 

AGACTGATAG GGACCTTCCA CCACCTCACC TGCAATAGCG CTTTTGATTT GGCCTCTCTC 12000 

AACCAACTCG ATAGGCAGAT AAGCATGGTC ACTTCCCACA TCTAGTAAAA TAGCCCCCTG 12060 

TGACACAAAG GAAGCTACCA ATTCTAATCT CTTTGAAATC ATCTTCTCTC ACTTTCCAAA 12120 

ACTCTATTAC CTCTTATTAT ACXy^CATTTC AATCTTCAAC TTCCCAGTAA TATAAGCACC 12180 

TCTGGCGAAA GAAGTTTCAA TGTCCTAAAG TAATAAGTGA ATCCAATTGA AAGATTTTAA 12240 

ACAATTTGCA AAAATGTCAA AAAATAAAAA ATAAACAGTT TATTCAGAAA ATTCTTGACA 12300 

TATAAAAACA CATGGTAGAA TATAATTAGA AAGTTAGAAA AAATAAAAGT TTGACTAAAA 12360 

TTTGTATTTG AAGGTGGTGT TCAGATAAGA AATTTAGTCA GACX»ACCAC GAATTTGCTC 12420 

TATGCTTTCT GGAATTTATC ATAACAGGAG GATACAGTCA TGGAACAAAC ATTGTTTGAA 12480 

TTAGAACTAC TTCCAGAGGA AGATATCATT GTCACAGGTC TCCCTAAGTA TTGTTCTTTT 12540 

ACTTGTTTAA TTACAGGTCG CTAGTTATAT TTTATATAAA ATAAGTAGCT TTACTTACGG 12600 

AATAGGCTAG TGCTGTGTCT CTAGCCTATT TTAATAATTA GGAGTTTGTT ATGGATTTAT 12660 

TAGAGAAAGA ATGTTTAAAA TGTGATAAAA ATTTCCAACA GGGTGATATT TGGAATTACT 12720 

ATTATTTATC AGATAAGATG CCTGCACAAG GGTGGAAAAT ACACATAAGC TCCCAAATAA 12780

AAGACGCTGT AAATATTTTT AAGATTGTGT ATAAACTATC CCAACTAAAT AATTGTAGCT 12840 

TTAAAGTTGT TAAAAATTTA GAGGAATTAA AAAAAATTAA TTCCCCTAGG GAAATGAGCC 12900 

CTACTGCTAA CAAATTTATA ACTCTATATC CTAAGTCAGA ATCTGAAGCT AAGAGTATGA 12960 

TTTGTAATCT TACGAATAGA CTGTCAGAAT TTAAGGCTCC AAAAATACTA TCTGACTATC 13020 

AATGTGGAAT GCATTCTCCA GTTCATTATA GATATGGGGC TTTTTTAAAA AAACAAGCTT 13080 

ATGATGAAAA AAATAAAAAA GTCATCTATT TATTGCTAGA TGAAAAAAGG AAGAACTATG 13140 

TAGAAGATAA GAGACAAAAT TTCCCTAGTC TTCCTAGCTG GAAAATGGAT TTATTTTCAG 13200 

AAGAAG 13206



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13104 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

CCGGATCCAG CGAAAAATAT GCTCTTTGAT GCTGTAAGTG GTCAAAAAGA TGCTAAAACA 60 

GCTGCTAACG ATGCTGTAAC ATTGATCAAA GAAACAATCA AACAAAAATT TGGTGAATAA 120 

AAAATTTGTT CAAGGGGGGT GGAAATCAAA TCCCCCTTTG AATTTATCAA TAGAGACACA 180 

AATAATTTAG CTTTCTTATA AAAAAGTAGT ATCCTATGAA AGGAGTTAAT ATGGAAAAGC 240 

AACAACCTAG TAAAGCAGCC CTGCTGTCTA TCATTCCTGG GTTAGGACAG ATTTACAATA 300 

AACAAAAAGC CAAAGGTTTT ATCTTCCTTG GTGTAACCAT CGTATTTGTC CTTTACTTCC 360 

TAGCACTTGC AACCCCTGAA TTGAGCAACC TCATCACTCT TGGTGACAAA CCAGGTCGTG 420 

ATAATTCCCT CTTTATGCTG ATTCGTGGTG CCTTCCATCT AATCTTTGTA ATCGTTTATG 480 

TACTCTTTTA TTTCTCAAAT ATCAAAGATG CACATACGAT TGCAAAACGC ATTAACAATG 540 

GAATTCCAGT TCCACGCACA CTCAAAGACA TGATCAAAGG GATTTATGAA AATGGCTTCC 600 

CTTACCTCTT GATCATTCCA TCTTATGTTG CCy^TGACCTT CGCGATTATC TTCCCAGTTA 660 

TCGTAACCTT GATGATCGCC TTTACCAACT ACGACTTCCA ACACTTGCCA CCAAACAAGT 720 

TGTTGGACTG GGTTGGTTTG ACCAACTTTA CAAACATTTG GAGCTTGAGT ACCTTCCGTT 780 

CTGCCTTTGG TTCTGTTCTT TCTTGGACTA TCATTTGGGC TTTGGCAGCT TCTACTTTAC 840 

AAATCGTAAT TGGTATCTTC ACAGCTATCA TTGCCAACCA ACCATTTATC AAAGGAAAAC 900

GTATCTTTGG TGTTATTTTC CTTCTTCCTT GGGCTGTCCC AGCCTTCATC ACTATCTTGA 960 

CATTCTCAAA CATGTTTAAC GATAGTGTCG GTGCTATCAA CACTCAAGTA TTGCCAATCT 1020 

TGGCTAAATT CCTTCCTTTC CTTGATGGAG CTCTTATTCC TTGGAAAACA GACCCAAC'IT 1080 

GGACTAAGAT TGCCTTGATT ATGATGCAAG GTTGGCTCGG ATTCCCATAC ATCTACGTTC 1140 

TGACCTTGGG TATCTTGCAA TCTATTCCTA ACGACCTTTA CGAAGCAGCT TATATTGACG 1200 

GTGCCAACGC TTGGCAAAAA TTCCGCAACA TCACTTTCCC AATGATTTTG GCTGTTGCGG 1260 

CACCTACTTT GATTAGCCAA TACACCTTCA ACTTTAACAA CTTCTCTATC ATGTACCTCT 1320 

TCAATGGTGG AGGACCTGGT AGTGTGGGAG GTGGAGCTGG TTCAACCGAT ATCTTGATCT 1380 

CATGGATCTA CCGTTTGACA ACAGGTACAT CTCCTCAATA CTCAATGGCG GCAGCTGTTA 1440 

CCTTGATTAT CTCTATCATT GTCATCTCAA TCTCTATGAT CGCATTCAAG AAACTACACG 1500

CATTTGATAT GGAGGACGTC TAAGATGAAT AACTCAATTA AACTCAAACX3 TAGACTGACT 1560

CAAAGCCTTA CTTACCTTTA CCTGATTGGT CTATCAATTG TAATTATCTA TCCACTGTTG 1620 

ATTACCATTA TGTCAGCCTT TAAAGCAGGT AACGTCTCAG CCTTTAAACT AGATACTAAT 1680 

ATCGACrrCA ATTTTGATAA CTTTAAAGGC CTCTTCACTG AAACCTTGTA CGGTACTTGG 1740 

TACCTCAACA CTTTGATTAT CGCCTTAATT ACCATGGCTG TTCAAACAAG TATCATCGTA 1800 

CTTGCTGGTT ATGCTTACAG CCGTTACAAC TTCTTGGCTC GTAAACAAAG TTTGGTCTTC 1860 

TTCTTGATCA TCCAAATGGT GCCAACTATG GCCGCTTTGA CAGCCTTCTT CGTTATGGCG 1920 

CTTATGTTGA ACGCCCTTAA CCACAACTGG TTCCTCATCT TCCTCTACGT TGGTGGTGGT 1980 

ATCCCGATGA ATGCTTGGCT CATGAAAGGC TACTTCGATA CAGTGCCAAT GTCTTTAGAC 2040 

GAATCTGCAA AACTAGACGG TGCAGGACAC TTCC6CCGCT TCTGGCAAAT T6TTCTACCA 2100 

CTTGTTCGCC CAATGGTTGC CGTACAAGCT CTCTGGGCCT TCATGGGACC TTTCGGGGAC 2160 

TACATCCTCT CTAGTTTCTT GCTTCGTGAG AAAGAATACT TTACTGTTGC CGTAGGTCTC 2220 

CAAACCTTCG TTAACAATGC GAAAAACTTG AAGATTGCCT ACTTCTCAGC AGGTGCTATC 2280 

CTCATCGCCC TTCCAATCTG TATTCTCTTC TTCTTCCTAC AAAAGAACTT TGTTTCAGGA 2340 

CTTACAAGTG GTGGCGACAA GGGATAATTT ATCCCCGCCA CCCTTTTTCA TTTTATACTC 2400 

TTCGAAAATC TCTTCAAACC ACGTCAGCTT TATCTCCAAC CTCAAAGTTG TGCTTTGAGC 2460 

AACCTGTGGC TAGTTTGCAC TTTGATTTTC ATTGATTATT AGCAATTGTC ACTGTAAATA 2520 

ATATCCTTGT AGCAAGCAAT TTTTCTCCTA GACTTGAAAT AAAGCGCATT TCTCTATATA 2580 

ATAATACTCA TATAGAAAAC ACCTTTTAGA AAGATACCTA TGCTTCCATA TCXZATTTTCC 2640 

TATTTTTCAA GTATTTGGGG GGTTCGTAAG CCCCTGTCCA AACGTTTCX3A GCTCAACTGG 2700 

TTTCAACTTC TCTTTACCAG TATCTTCCTT ATCAGCTTGT CTATGGTACC CATTGCTATC 2760 

CAAAACAGCT CCCAGGAGAC CTATCCGCTA GAAACTTTTA TCGATAATGT CTATGAACCT 2820 

CTGACAGATA AGGTTGTCCA GGATCTCTCT GAACATGCTA CAATTGTCGA TGGCACATTA 2880 

ACTTATACTG GAACAGCTAG TCAAGCCCCT TCTGTTGTGA TTGGTCCAAG TCAAATCAAG 2940 

GAATTACCTA AGGACTTGCA ACTGCATTTC GATACAAATG AGCTAGTCAT CAGCAAGGAA 3000 

AGCAAGGAAC TGACCCGCAT CTCTTACCGA GCCATTCAGA CTGAGAGTTT CAAAAGCAAA 3060 

GACAGCTTGA CCCAAGCAAT TTCTAAAGAC TGGTACCAAC AAAATCGTGT CTATATCAGC 3120 

CTCTTCCTAG TTCTCGGTGC GAGCTTCCTC TTTGGTTTGA ATTTCTTTAT CGTCTCTCTT 3180 

GGAGCTAGCT TTCTCCTTTA TATCACCAAA AGATCACGCC TCTTTTCATT TAATACCTTT 3240

AAAGAGTGCT ACCATTTTAT CTTGAACTGT TTAGGATTGC CGACTCTGAT TACACTTATT 3300

TTGGGATTAT TTGGCCAAAA TATGACAACC CTGATTACTG TACAAAATAT TCTTTTTGTT 3360 

CTGTATCTGG TCACTATCTT TTATAAAACA CATTTCCGTG ATCCAAATTA CCATAAATAG 3420 

GAGATTTTTA TGCCCGTTAC GATTAAAGAC GTGGCCAAGG CTGCTGGTGT TTCGCCTTCA 3480 

ACCGTAACCC GTGTTATTCA AAATAAATCA ACCATTAGCG ACGAAACAAA AAAACGTGTT 3540 

CGCAAAGCTA rCAAGGAACT CAACTACCAC CCAAACCTCA ACGCTCGTAG CTTGGTAAGC 3600 

AGCTATACTC AGGTTATCXSG ATTAGTTCTT CCTGATGACT CAGACGCCTT CTACCAGAAT 3660 

CCTTTCTTTC CATCGGTTCT ACGTGGCATC TCTCAAGTCG CATCTGAAAA CCACTATGCC 3720 

ATTCAGATAG bAACAGGGAA AGATGAGAAG GAGCGTCTCA ACGCTATTTC ACAAATGGTC 3780 

TACGGCAAGC GTGTAGATGG GCTAATTTTT CTCTATGCCC AAGAAGAAGA CCCTCTCGTA 3840 

AAACTCGTCG CAGAAGAACA GTTCCCCTTC CTTATCTTAG GTAAATCTCT ATCTCCTTTC 3900 

ATCCCACTTG TCGACAACGA CAATGTTCAA GCTGGTTTTG ATGCGACTGA ATATTTCATC 3960 

AAAAAAGGCT GCAAACGCAT TGCCTTTATC GGAGGAAGTA AAAAGCTCTT CGTGACCAAA 4020 

GACCGTTTAA CAGGCTATGA ACAGGCGCTT AAACATTACA AACTTACCAC TGACAACAAT 4080 

CGCATCTACT TTGCCGACGA GTTTCTGGAA GAAAAGGGCT ATAAATTTAG CAAGCGATTA 4140 

TTCAAGCACG ATCCACAAAT TGATGCTATC ATCACAACCG ATAGCCTCCT AGCTGAAGGT 4200 

GTTTGTAACT ATATTGCCAA ACACCAGCTG GATGTCCCTG TTCTCAGCTT TGACTCGGTT 4260 

AATCCCAAGC TCAACTTGGC AGCCTAT6TC GATATCAATA GTTTAGAGCT TGGTCGTGTT 4320 

TCCCTTGAAA CTATTCTCCA GATTATTAAT GATAATAAAA ACAATAAACA AATTTGTTAC 4380 

CGTCAATTGA TCGCCCACAA AATTATCGAA AAATAAGAGA CTGGGCAAAA AGTCGTTAAA 4440 

AGCAAAAACG CATACTATCA GGTATTGAAA AAACTTGATA CTATGCGTTT TATTGTGGGA 4500 

AGATTTACTT CCTTTTCTAC TGAAATTGAG TCTTTTCCCA AGATCTTTTT ATACTCAATG 4560 

AAAATCAAAG TGCAAACTAG GAAGCTAGCC GCAGGTTGCT CAAAACACTG TTTTGAGGTT 4620 

GTAGATGAAA CTGACGAAGT CAGTAACCAT ACCTACGGCA AGGTGAAGCT GACGTGGTTT 4680 

GAAGAGATTT TCGAAGAGTA TTAATCACTA ATTATCTATC TCAACAAATC TTCCTAGAAT 4740 

ATGAACATTT TCCGAGACAG AGACAAAGGA GCTTGGATCC ACTTGTGTCA TAATCTGTTT 4800 

AAATTCATTA AACTCTGCAC GTGTAATGAC AGTGATTAAA ACTGCCTTTC TCTCGTGATT 4860 

ATAGGTTCCT TCTGCATCGT GGATCATGGT TCCTCCGCGG TGCAATTTTT TATGGATTTT 4920 

TTCAATTACC TTCTCTGGAT GATTTGTCAC AATCATGGCC TGCATACGCT TTTGCTTAGT 4980 

AAAGACTGCG TCTGTCACAC GGCTAGAGAC AAAGATGGTA ATCATAGAAT AAA6AGCGTA 5040

TTTCCAACCA AAGGTCAAAC CTGCTATCAG CATGATAGTT CCATTTACCA AGAAAGAAAT 5100

ACTACCGACA TTCTTACCCG TTTTCTTACG AATAGTCAGG CTGACGATAT CCX3TCCCACC 5160 

ACTGGAGATA TTGTTTCGAA GAGCAAAACC AATCCCCAAA CCCATAACAA CACCCCCAAA 5220 

AAGGGAATTG ATAATGGGAT CCTCTGTCAA GGTTGCCACA GGGACAAACT GGATAAAGAA 5280 

GGAACTCATA GATACCGTGA TAAAGGTAAA GACGGTGAAC TTATGGCCAA TCTGATACCA 5340 

AGCTAAGACC ATCAAAGGGA AGTTAATGGC GTAGAAGCTT AGCGAAATCG GAATATGAAA 5400 

ACCAAACCAG TGATTACTCA AGGCAGAGAT 7ATCTGTGCC AGACCTGTTG CACCACTCGA 5460 

ATACACATGC CCTGGTTGGA AAAAGAAATT AACTGCTACT GCTGATAAAA AACCATAGAC 5520 

CAGAGAGGCC GAAATCTTCT CATCATACTT TTCTCGAGAG ATACTTTGTA AGACACGTAA 5580 

AATTTTTATC TGATAAGCAA AGCGGCGCAG ATAATAGCGC CACXGCTTAA TTCGTTTTGT 5640 

TTGTTTCATC TTCTTCTACT TGTAAGCTGA GTTCCTCTAG TTGTTTGAGA GCGACTGTTG 5700 

ATGGAGCTTG TGTCATTGGG TCAGTTGCCT TGTTGTTCTT AGGAAAGGCA ATGACTTCAC 5760 

GGATATTTTC TTCTCCAGCA AGCAACATGA CAAAACGGTC AAGCCCGATA GCCAAACCAC 5820 

CGTGTGGTGG GAAACCATAG TCCATGGCTT CAAGAAGGAA ACCAAACTGG TCATTGGCTT 5880

CTTCAGTTGA GAAACCAAGA GCCTTGAACA TGCGTTCTTG AAGGTCTTTT TGGTTGATAC 5940 

GAAGGCTACC ACCACCAAGC TCATAACCGT TCAAGACGAT ATCGTAAGCA ATGGCACGAA 6000 

CCTTAGCCAA ATCACCTTCT AATTCATGAG CAGTCTCTTC CTGTGGAAGT GTGAAAGGAT 6060 

GGTGGGCGCT CATGTAGCG6 CCTTCTTCTT CAGACCATTC AAACATCX5GC CAGTCAACCA 6120 

CCCAAAGGAA GTTGAACTTA TCATTATCAA TCAAGCCAAG CTCTTTAGCA ATACGTCCAC 6180 

6AAGGGCACC CAGTGTTGCA TTAGCCACTT CAAGCGTATC CGCCACAAAG AGAACCAAGT 6240 

CCTTATCTTC AAGAACAAGC GCTGTTGTCA ATTCTTCTTG GATACCAGTC AAGAACTTGG 6300 

CAACTGGTCC GTTTAATTCT CCATCAACCA CCTTGACCCA AGCAAGACCT TTGGCACCAT 6360 

ACTGTTTGGC TACTTCCGTC ATCTTGTCGA TGTCTTTACG TGAATAGTTG TCCGCAGCTC 6420 

CTGTGACCAC AATCGCTTTT ACAGCAGGTG CTTCTGAAAA GACTTTAAAG TCTACACCTC 6480 

GGACCACTTC TGTCAAGTCC TGAAGCAACA TGTCAAAACG AGTATCTGGC TTGTCAGAAC 6540 

CGTAAAGAGC CATAGCATCA TCGTATTTCA TACGAGGGAA TGGTAGCGTT ACTTCGATGC 6600 

CTTTTGTTTC CTTCATCACG CGCGCGATCA AGCTTTCTGT AATATCTTGG ATTTCTTGCT 6660 

CAGTAAGGAA GGACGTTTCC AAGTC6ACCT GAGTAAATTC AGGCTGGCGG TCTCCACGCA 6720 

AGTCCTCGTC ACGGAAACAT TTAACGATTT GGTAGTAACG GTCAAAACCA GCATTCATCA 6780

AGAGCTGTTT CGTGATTTGT GGACTTTGAG GAAGAGCGTA AAAATGCCCC TTATTAACAC 6840

GAGACGGCAC TAAATAATCA CGCGCCCCTT CAGGCGTTGA CTTAGAAAGG AATGGTGTCT 6900 

CCACGTCGAT AAACTCCAAC TCATCCAAGT AGTTGCGGAT AGAGTGGGTC ACCTTGGCAC 6960 

GAAGTTTAAG ATTTTCCAAC ATTTCTGGAC GACGAAGGTC AAGGTAACGG TAACGCAAAC 7020 

GTGTATCGTC ATTTGCCTCA ATGCCATCCT TAATCTCAAA TGGTGTTGTC TTAGCTGTGT 7080 

TAAGCACAAT AAGAGCTGTC ACGTTTAACT CAACCGCACC AGTTGGCAAC TTATCATTGG 7140 

CTTGTCACGC 6CAGC6ACCT GACCAGTCAC CTCAATAACA AATTCGCTAC GAAGGcTTTC 7200 

AGCTGTTGCC ATAACCTCTG CAGATACTTT TTCAGGGTTG ATAACCAACT GCATGATTCC 7260 

TTCACGGTCA CGAAGATCGA TAAAGATCAA ACCACXZAAGG TCACGACGAC GGCCAACCCA 7320 

TCCTTTCAAG GTTATTTCTT GTCCGATGTG TTCCTCACGA ACACGACCAG CATACATACT 7380 

ACGTTTCATT ATTTCTCTCC TCTTTTATTC TGTTACTATT TTACCATAAA AGCGCAGCTC 7440 

TTCATGAAAA TCATCAGAAA AGTTTGCCAG TCTTTAAAAG TCAGGTGAAA GCCCTAAAAA 7500 

TTAGCGCTAA TACTCTTCGA AAATCTCTTC AAACCACGTC AGCGTCGCCT TACCGTATGT 7560 

ATGGTTACTG ACTTCGTCAG TTTCATCTAC AACCTCAAAA CCATGTTTTG AGCTGACTTC 7620

GTCAGTTCTA TCCACAACCT CAAAACAGTG TTTTGAGCAA CCTGCGGCTA GCTTCCTAGT 7680 

TTGCTCTTTG ATTTTCATTG AGTATAATAC AAAAATCCGA TGAACTTCAC CGGACTCTTT 7740 

TATTTTGAAT TTTTGCCTGC TTTACGCTTT TCAGCGATTT CGGCTGCCTT TCGA6GCAAG 7800 

ACAATTTCCG TTATGTAAGC CGTCCCAAAA CGCAGTACAC CTGCAATAGG AGCAAAGACA 7860 

ACTGCTAGAT AGTTATAGAA GAAATCGCCT TTGAAGGCAT AAGCTAGCGC TCCAATGATG 7920 

AAAAATAGAA CGACTGCCTG AATCACTGCT AATAAAATTA CTCGTTTCAT GTGACCTCCT 7980 

GACTCTATTA TAGCATGAGA ATCATCAAAA AGCCX3ACTAA ATTATTCAAA GCGTGAAGAG 8040 

AAATACTGTA GACCAGACCT TTTCTGCTAA TGTAAGCCAA ACCCAAACTA AAACCAAGGC 8100 

TAAAATAGAC AAAAAATTGT TGCACATCAC CTGGAAAATG AATCAAGGCA AATAGAAGAC 8160 

TAGATACCAG AAGAAAAATC AGGGTTCGTT TACTATTGTC CTGCTTAGGA AAGAGATAGC 8220

GTGCTAACAT CCCTCTAAAA ACAATCTCTT CCGTCAAAGG AGCAAAAATA ACCACAGCAA 8280 

AGAATGAGAA AAGTGGTTGA GACAAGGTCA AGTCTGTCGC TATTTGCTGA TTTACTGAAG 8340 

GATCATCTGG CAAGAAGAAT TGAACGACCA GAGATAAGAA CCAAACCAA6 ACAGGAAGCC 8400 

AAATAAATCG ATTAAAGCC6 CTCTTCTCAA TATGAACAGG AGCCTTCTGA TACCATTTGT 8460 

AAATGCCGTA CACATATACT CCAGCCAAGG CCACATAGAG TAGAGTAACA GCATAGGGTG 8520 

AAGCGCC7AA AGCAAGCGAC GCAGTCGCGA GCCCCTGAAT AAA6CCATAG ATAAATAAAA 8580

AGGATAGAAG GGCTAGAAGA ATCCAGCCAA GGTTTTTAAG TAATTTCATA GATAACTCCT 8640

TTATTTGAAA TAACGTTTTA CCATAGGTAA CTGCATCACA TTGATATAAA CATGGATGGC 8700

TCCTACAAGC AAGAAAGCTA GTAACTGAAT CTCTCCTGTC AAGAAAGAAA TGATAATAAG 8760

AAAAATATAT AAGGCTGGTA AGACATATTG GTGTAATTGG AATAAAATTC GAAAACTCTG 8820

TTCCAAATTA GCCTGACGCT CCCCTTCATC ATAAGAATTT ATATAGTTCA AGACATCCTT 8880

TGGTGTAGCG AAAAATTCCA AATCAAACTG ACGAACAATC GCAATGGTTT TAAAAAGAGA 8940

TTTTTGAGCG ACTAAGAATA CCACAAAGAG TAAGAAAGAA AGGAAAAATG TTTGAGGGTT 9000

TGTATGCAAT ATAATCACCT CACTTAATGA AATAAAAATA GCCAATGGAA TCGCTACACC 9060

TGTAATATTA AAAGCAATGG TTCCAAACTC AAGATTCCGA TACATTTGCA CATAATAGGT 9120

TTCATTCAGA TCGTCATCCA TTTCCTCTTG ATACAAAGAA TGAAATTTTC TGCTTTTCTT 9180

TAAGAAATTG AAAGTCAAAA ACATACTAAT GAAACCTATC AGTAAACAAA TAGCTGATAT 9240

CCATGGCATC AAGGCTTTTA CATCTAAAAT AATTTCGTGG GATTCGACAC GTGCCTTAAA 9300

CATCCCTACA AACATGCCCA AGAACCCCCC AAGACAATAG ACATCAAAAA TAACAATCTA 9360

CGTTTCTTTT TCATATTCAT TCTCCTTTTT CACTTGCTAG ATTTTTGGAT TTCTTTTCAA 9420

TCCATTCAAT TACTGGGATG AGAGCAAAGT AGACCCAAAC AAATTGGTCG CTTTGATAGG 9480

GATTAAACCA GCTTAGGTCC ATCCCAATCA GTAGAAATAC GCTGACTAAT AAAGCTATGA 9540

CCACTACATA ATAAATCACT TTATACTTGT TCATCACTCG TCCTCCTCCA AACGAAATAC 9600

CGATTCGACT GTTTCGTTGA AAATTTGAGA TATTTTCAGG GCAATGATAA TGGATGGGGT 9660

GTACTCATCC CGTTCTAGTA GGCTAATGGT CTGTCTG6AA ACCCCTGCCA GTTTGGCTAG 9720

GTCGGTTTGA TTGAGACXy^T CX3CGAGCTCG AAGCTCTTTT A6ACGATTTT TTAGTTGCAT 9780

TACTCTCCGT CAAATTCAAC GGTTTGGATA TCCTCAATAC GTTGCAACTT 9840

GAATTTTTCT TTTCCCGTAT TATCTACACG TCGTAGCTTT ACCCATTCCT CATCAACATC 9900

CACAACTTCC CAGTTATCTG GCCCAATATA CACTCCCGTT ATAATTGGTT CCTTTCCAAT 9960

CATTTCTTGT AATAATCTCG ACATTTCTGC GTTTCCTTTC TCTTTTCGCT CAAGTCTTTT 10020

GATTTTATTC TCTAGTTTCT TGATTTTTTT AGAATTATTA GAATAAAAGA AAATCATAAA 10080

TAGTATAAAT CCTAGTACCC ACATTATAAC TCCTTTCTGC TTCCTATTTC TTAACTTGAA 10140

TTCATTGTAA CATATCTTTT TCTTTTTGAC AAGTATAGTT GTCAAAAAAA TTATGATTTT 10200

TGTCATTTTG CAAAAGAAAA AGGTCAGGAG TAGGTTCCTG ACCACTTTAT CTATCATTAA 10260

TACTCTTCTA AAATCTCTTC AAACCACGTC AGCTTCACCT TGCCGTAGGT AT6GTTACTG 10320

TTTCATCTAC AACCTCAAAA CCATGTTTTG AGCTGACTTC GTCAGTTCTA 10380

CAAAACCATG TTTTGAGCTG ACTTCGTCAG TTCTATCCAC AACCTCAAAA 10440

CCATGTTTTG AGCTGACTTC GTCAGTTCTA TCCACAACCT CAAAACAGTG TTTTGAGCAA 10500

CCTGCGGCTA GCTTCCTAGT TTGCTCTTTG ATTTTTATTG AGTATAAAAT CCTAGTTTTT 10560

CAAAGATTTC TGAGAAGTTT TGGCTGATTG TCTCAAGTGA CACTTGCACT TCTTCTCGGG 10620

TTTGGTTGTT CTTGACCGTC ACTTGTCCGC TTTCGACTTC GCTCTCTCCT AGGGTGATGA 10680

GGGTCTTAGC CGCAAAGACA TCGGCTGACT TGAACTGAGC TTTTAGTTTA CGGTTGAGGT 10740

AATCACGCTC T6CTTTGAAA CCTTGTTGGC GAAGAGCCTG TACCAATTCC AAGGCCTTGA 10800

TATTTGCCCC TTCGCCCAAG ACTGCGATAT AGACATCTAG GGCGTTTTCG ATAGGGAGGG 10860

TCACACCTTG CTTTTCAAGG ATGAGAAGCA GGCGCTCTAC ACCAAGTCCA AAACCAAATC 10920

CAGCAGTTTC AGGGCCTCCA AAGTAAGCAA CCAAACCATC GTAGCGACCA CCCGCACAGA 10980

CGGTCAGGTC ATTGCCCTCA ATCTCTGTGA TAAACTCGAA AATGGTGTGG TTGTAGTAGT 11040

CCAGACCACG CACCATATTG GTATCGATGA TGTAATCTAC TCCAAGATTT TCCAACATCT 11100

GACGCACAGC ATCAAAATGA GCTTGGCTTT CTTCATCAAG AAAGTCCAAG ATAGACGGCG 11160

CATTCTCTAC TGCCACCTTG TCTTCTTTTT CCTTAGAGTC CAAGACACGA AGAGGATTTT 11220

CCTCCAAGCG ACCTTGGCTA TCCTTAGACA AGGTCTCCTT GAGCGGTGTC AAATAGTCAA 11280

TCAAG6CTTG GCGGTAGGCT GCACGGCTCT CAGGATTTCC AAGAGTGTTG AGGTGCAATT 11340

TGACACCTTG AATACCGATT TCCTTCAAAA AATGGGCTGC CATAGCGATT GTTTCCACAT 11400

CGGTAGCTGG ATTGCTAGAG CCAAAACACT CAACACCAAT CTQGTGGAAT TGGCGCAAGC 11460

GCCCTGCCTG TGGACGCTCA TAACGGAACA TAGGTCCCAT GTAGTAGAAC TTGCTTGGCT 11520

TTTGCACTTC TGGGGCGAAA AGTTTATTTT CCACATAGGA ACG6ACAACG GGT6CAGTTC 11580

CTTCTGGACG GAGGGTAATA TGACGGTCAC CCTTGTCATA AAAATCGTAC ATTTCCTTGG 11640

TTACGATATC CGTTGTATCT CCGACAGAGC GACTGATAAC CTCGTAATGC TCAAAAATAG 11700

GCGTGCGCAC TTCTGCATAG TTGTAGCGTT TGAAAATCTC ACGGGCAAAG CCCTCAACGT 11760

ACTGCCACTT AGCAGACTCA GCAGGTAAAA TATCCTGCGT TCCTTTTGGT TTTTGTAATT 11820

TCATAGGGAA TCCTCTTTAA ACTTAATAGT CTTATTTTAC CATAAATAGA GGGATTAAAA 11880

CAGTAAGAAA AAAATTAGGA TTTAGATATC ATTTTTGAGA TTAAGAATTG TCAAAAAAAT 11940

AGCTAGCAAG GAAAGACCAA CAAATAGCAT CCAAGTCAAC TGTATATTCC ATACG6CTAC 12000

TA6TGAAAAA CAAGCTGTTC CCACAGGTAT GGATAAGGTA AACAATAGAC CTAAAAAATT 12060

ACTA6TACGA GCTAGAACCT CTGGAGCTAG ATTTTTCATG AGCATGGCAC TAATCTTTGG 12120

TTGAACTTTA CCAGACACAT ACAGAGTAAA GAAGAGAAAT AGCAAACCAA GCACGACTTG 12180

ATTGAATAAA TTAGCCAAAC CAACTAGACT AAGTCCTACG GTCTCCCACA TCATCAATCT 12240

AGGCAAGGAC TGCTTCCCAA AATAATCATT GCCCGTAAGG CTACTGATGA TGACTGATAC 12300

TAAAACACAG AATTGATTGA TAAATAGTGC CTCTOTATAA GAAAAATTCA AGAGAGAATG 12360

GCTCAAAAAG AAGATATTAT AAATTCCACC CAAAGCGCCA CCCAAGGAAT TAATAAGCAA 12420

GACAGCAAAG AGCATAAAAC CAAAGTTTTT CTGTCCACTT TTAAGAAAAA CGAGACGTAA 12480

ATTTCGGTAA ATTGTTAGGA ACTGGTCTTT GATAGAAAGC TTCTCATTTT TTAAGTTTTC 12540

ACCATCAGCA GATGACATTG ACAGGCTCAA TTTGCTTTTT CCTAAAAAGA GGATAGTGGC 12600

TGATACTAG6 AAAAAGCAGG CATTGATTCC CGCAACGAGA GAAAAATTGT TGACCGATAG 12660

AGCTAAGAGC CA6ACTCCGA AAGCTTGACC ACCAATAGCT GAAATATAGG TGATGAACTG 12720

TGAAAAAGAA TAAGCCTCCA TCAGATCATC TTCAGCTACT TTTTCCTTAA TAAGAGGCAT 12780

ACGCAGGCCA CCTGCAAAAT CACTGATGAT ATCACTAATG ACATTGATCA AACACAGGCT 12840

AGAAAAGGCA AAGAGACTAG CTTGCTGAAC AACTAGGGCT GCTAGAAAAA ATAGAACCGC 12900

CTGAAACAAA CCGCTATAGA CCATCCATTT GACCTTGTCC CTCGTGTAAT CTGCCCGAAT 12960

CCCTGCAAAA ACTGTAAAGA GGGTCGGAAG AATCATGACA ATATTCGCCA TAGCAACAGC 13020

AAAAGATGCT TGTGACAAGG TCGATGCATA GACGATAAAG ACCAGGTTGA AAATCGAAAC 13080

ACCAAAAGCA TTGAAGAAGC GTGG 13104



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19250 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

CCGGGCAAAT AGTTTTGAAC TTTTCATCAT TTTCTCCTTT AAAACTTTCT CTCCATTATA 60 

GACTCTTTTC AGAAAGTTGT CAACAGAATT TTCAGAATTT TTGAAAATTA TTTTTCAAAC 120 

AACATCTTTG CAAAAAATAT GAATATCGTA AGCGCGTCAT AACAAGGTAT CTATCATTCA 180 

TGGAGCTCCT CCTGTATACT ATTAGTAAAG TAAATATTGG AGGATATTTT AATGCCACAA 240 

CCTATTGTTC CTGTAGAGAT TCCACAATCT CGTCGTTTTG ATTCTAAAAA GAGAAATGAT 300 

ATTCTrCTTA AAATTCGTAT TGGCAAGCTT GAAGTAAGTT TTTTTCAATC TCTCAATCTC 360

GAAATGATAG AACAGCTTTT GGATAAGGTG TTGCTCTATG ACAATTCATC TATCTAGCCT 420

AGGGCAGGTC TATCTCGTGT GTGGGAAAAC TGATATGAGA CAAGGAATCG ATTCACTGGC 480

TTATCTCGTT AAAACCCACT TTGAATTGGA TCCTTTCTCC GGTCAAATCT TTCTCTTTTG 540

TGGTGGACGT AAAGACCGCT TTAAAGTCCT TTACTGGGAT GGTCAAGGAT TTTGGCTACT 600

ATATAAACGC TTTGAGAACG GCAGACTGAC TTGGCCCAGT ACAGAAAAGG ATGTCAAAGC 660

TCTCGCACCT GAACAAGTAG ATTGGCTGAT GAAAGGCTTT TCTATCACTC CAAAAATATA 720

GTAGATTGAA ACTAGAATAG TACACCTCTG CTTCTAAAAC ATTGTTAGAA ATCGATTTTA 780

CTGTCCTGAT CGATTT6TCC TGTTATTATT TCATTTTACT ATAAATCCAT CAGAAAGTCG 840

TGATTTCTAT TGAAATGAGG ACTTTCTTTT TATACTCATC TGCTTTCAAA AAGCACTCTA 900

GTCCATCTCC GATTAACGAT GGACTTTATC ACCTCCTTCT CCAGTCCTTG TATAACATCT 960

TGAAGTTGAT TCATGACATC TTCCAAAGTT CGAAAGGCTT TATTCTTAAA TCCACGTTTA 1020

CGAATCTCTT TCCACACTTG TTCAATGGGG TTCATCTCTG GTGTGTATGG AGGAATAAAT 1080

GCAAAGCCAA TATTAGTCGG AATCTTTAAG GTACTTGATT TATGCCATAT AGCATTGTCC 1140

ATAACGAGTA AAAGATAATC ATCTGGATAA GCTTGTGAAA GCTCCTATTC CTAAAGCCCC 1200

TTTATAACCT CTTGCGAGAG AGACTATTGA CTCAGCCCTT ACTTCATGCG GATGAAACCT 1260

CCTATCGGGT TCTAGAGAGT GATAGCCATC TGACCTACTA TTGGACTTTT TTGTCAGGTA 1320

AAGCAGAGAA ACAAGGGATT ACGCTTTACC ACCATGATCA GTGTCGAAGT GGTTCAGTAG 1380

TACAAGAATT CCTAGGAGAT TATTCTGGCT ATGTTCATTG TGATATGTTG CGGCAGTAAC 1440

TTAGGACTTT AGTCCTCTAG TTCTGCCTAT GCGATAGCAG TCCAAGGTTT AGGAGTAAGG 1500

CGACGCTAAG CTTGGTAAAC TGCGAACAGC TAGAAGCTTA TCGTCAACTG GAAGAAGCTG 1560

CACTTGTTGG ATGTTGGGCG CATGTGAGAA GGAAGTTTTT TGAAGTGCCC CCCAAGCAAG 1620

CAGATAAATC ATCCTTAGGA GCTA/^GGTT TAGCCTATTG TGATCAGTTA TTTTCCTTGG 1680

AAAGAGACTG GGAGGCTTTG CCAGCTGATG AACGGCTACA GAAACGTCAA GAACATCTCC 1740

AACCCCTACT GGAAGACTTC TTTGCTTGGT GCCGTCGTCA GTCAGTTTTA TCGGGTTCAA 1800

AACTAGGAAG GGCAATTGAA TACAGCCTCA AGTATGAAGA AACCTTTAAG ACCATTTTAA 1860

AAGACGGACA TCTGGTCCTT TCCAATAATC TAGCTGAACG CGCCATTAAA TCATTGGTTA 1920

TGGGACGGAG TAAAAGAGTC CAGTGGACTC TTTTAGCCTA AGCTCAGTTT AAAAAAACGA 1980

GGGTGGTTAT TTTTAAAAAA GCGAGGGTGG TTATTTTCTC AAAGTTTTGA AGGAGCTAAA 2040

6CAAGAGCTA TTATTATGAG TTTGTTGGAA ACAGCTAAAC GTCATCAATT ATAGTGCGTT 2100

GAATCTATAA CAGTAC6CAT CGACTGCTAA AATATTTCTA TAAATCyVATT TTCCTTTCCT 2160

AATCGATTTG TTCATATCTT ATTACAATCC ATTATAAATA GCGAGAAATA TCTATCCTAT 2220

CTTCTAGAAT GTCTTCCAAA CGAGGAAACT CTCGTAAACA AAGAG6TTTT AGAGGCCTAT 2280 

TTACCGTGGA CTAAAGTTGT ACAAGAAAAG TGCAAATAAG AAATCTCCAG ATTAGGAACT 2340 

ATATATGAGT TCTCTA6TCT GGAGATTTTT CAATAGACTT CGTTATTGGG CGGTTACTTT 2400 

CGAAACTTTG AAAACTTCAA AAAACGGATT TTTATCGCTC TGAACATCAA AAAAGAAAGG 2460 

ACGAAATTTG TCCTTTCTCA AGCTTAGCTT TTCTTCAACC CACTACAGTT GACAAAGAGC 2520 

CCTTTATTCT ATCAAACATG AAGCGCAAAA ACAAGCCAAA AATCCGATAG AATGGCTATC 2580 

CCTCGACTAT CAAGTAAGAC ATTTCCATCA AATACGTTCA ATTTTACTCT TGTTCTACTA 2640 

AGAATTAATC ATCTCGTTTT GATTTATTAA AAATATACAA TTCAGCTTTT CCTCCAAACT 2700 

ATTTTATCCA CTATCCCTGT ATAGCTCTGT ATTATCTTAA CAACTTTAGT AGAGACATTT 2760 

TCCTCAACAT AATCCGGAAC CGGTAATCCA AAATCCTCAT CTTGTGCCAA GCTAACAGCA 2820 

GTTTCAACT6 CTTGAAGAAG AGAATTTTCA TCAATGCCTG CCAAAATAAA TCCTGCCTTA 2880 

TCTAAGGACT CAGGAGGTTC T6TACTT6TA CGAATACATA CAGCGGGAAA AGGATAACCT 2940 

TGACTAGTAA AGAAACTACT TTCTTCCGGT AAAGTTCCCG AATCAGATAC TACAACAAAT 3000 

GCATTCATCT GTAAACAATT ATAGTCATGG AATCCTAGTG GCTCATGCTG AATCACACGT 3060 

TTATCTAGTT TAAAACCGCT CTCTTGTAGC CTTTTCTTTG ATCTAGGATG GCAAGAATAT 3120 

AAGATTGGCA TATTATACTT TTCAGCTAAT TGATTAATTG CTGTAAAGAG AGAAATAAAA 3180 

TTTTTATCTG TATCAATATT TTCCTCACGG TGAGCTGAAA GTAAGATATA ACCTCCTTTT 3240 

TTCAATCCCA AACGTTCATG GATATCTGAA GACTCAATAG CAGATAAATT TTTATGTAAC 3300 

ACTTCTGCCA TAGGAGAACC AGTTACATAT GTGCGCTCTT TAGGTAAACC ACACTCATGT 3360 

AAATACTTAC GTGCATGTTC AGAGTATGCT AAGTTAACAT CTGAAATAAC ATCAACAATC 3420 

CGACGATTAG TCTCTTCCGG TAGGCACTCA TCTTTACAGC GATTGCCAGC CTCCATATGA 3480 

AAAATTGGAA TATGTAAACG CTTGGCAGCA ATAGCTGATA AACAAGAATT TGTATCCCCT 3540 

AAAATCAATA AAGCATCTGG TTTAATTTGA TTCATCTU^TT TGTATGAAGT ATTAATAATA 3600 

TTCCCTACAG TAGCACCAAG ATCATCTCCA ACAGCATCCA TGTATACGTC CGGAGTGTCT 3660 

AACCCTAAAT TATCAAAGAA AATACCATTT AAATTGTAAT CATAGTTTTG TCCAGTATGT 3720 

GCCAAAATAA CATCAAAATA CTTTCGACAT TTAGTGATAA CACTACTTAG ACGTATAATC 3780 

TCTGGACGTG TTCCCACAAT AATCAATAAC TTAAGTTTGC CATTATCTTT AAAGTGAATA 3840 

TCACTATAAT CTGTCTTAAT TTTCATTTAT TTCTCCACTT GTTCAAAAAA AGTATCTGGA 3900

TGTCTAGGAT CAAATGACTC ATTAGCCCAC ATGACAGTAA TTAGATTTTC TGTATCAGAA 3960

AGATTAATAA TATTATGTGC ATAGCCCG6T ATCATATGTA TTGCTTCAAT CTTATCGCCC 4020

GACACTTCAA AGTTCAGAAT AGGATACTCT TGACCGTTTT CATCCAGCCC TATCCTACGC 4080

TCTTGTATTA AAGCACGACC AGAAACAACC ATGAAT^TT CCCACTTAGA ATGATGCCAA 4140

TGTTGCCCTT TGGTAATGCC AGGTTTAGAA ATATTAACAG AAAATTGACC CGTATTTTCT 4200

GTTTTTAATA ATTCCGTAAA ACTACCTCGT TCATCTATAT TCATTTTTAG AGGAAACTTA 4260

/ACTTATCTA CTGGTAAATA AGATAGGTAG 6TAGAATACA ATTTCTTTTT AAACGATCCC 4320

TGAGGAATTT CAGGCATAAC TAAACTATCA GGCTGTTTTT TAAATGTTTC TAATAGAGAG 4380

ACAATCTCTC CTAAGGTTGC ACGATGAGTC GTTGGTACGT AGCAGTAGTT TCCTGATGGG 4440

CTAG6TAAGA TTTGTAATCC ATCTAGATTA CAACGATGAG GATTTCCTTC CAATGCAGTT 4500

AGACACTCTT GTATCAAATC ATCAATATAC AGCAACTCCA ATTCTACACT TGGATCATTT 4560

ACTTGAATAG GTAAATCGTG AGCTAGATTA TAACAGAAAG TTGCTACAGC AGAATTGTAG 4620

TTAGGACGGC ACCACTTCCC ATAAAGATTC GGGAAACGGT AAACTAAGAC AGGTGCTCCC 4680

GTTTTCTTTC CATATTCAAA GAAGAGTTCT TCCCCTGCTA GCTTAGATTG TCCATATATA 4740

GAGTTTGAAA ATCGGCCTTC TAAACTAGCT TGAGTAGAAC TTGAGAGTAG AACAGGACAA 4800

GTGTTTTCAT ACTTTTCTAA AATCTCCAAT AATCTACTTG AAAAACCGTA ATTTCCCTCC 4860

ATGAATTCAT CAGGATTCTG TGGACXyVTTG ACACCAGCTA AATGGAATAC GAAATCGGCC 4920

TTCTTACAAT ATTCATCTAA TAAAATCGGA TCTGTATCAC GATCATACTG AAAAATCTCT 4980

CCAATCTCTA AATTAGGACG AGTCCTATCT CGTCCATCTT TCAAAGCTTC CAGAGTACAG 5040

ATAAGATTTT TTCCTACAAA TCCTTTCGCT CCTGTGATTA AAATATTTTT AATCATGCCC 5100

CCTCCTTATT TTATATGCTG TTTTAATAGT TAACTCTCTC GACAATACAT GATACATTAT 5160

ATATCCTTGA TAATTTTAAT GTATCTTAAA AGATTTTACA TCTCTTCGTC TGCTACCATA 5220

TCACGAATTG CTGTCTGTAT TTCATCTAAT TCTAGCAACT TTCTTTTAAC TTGCTCTACA 5280

TCXaTCAAAT CGGTATTATT ACTATTGAAT TCTGTCAACA AATTTCTATT CGTACTACCA 5340

TCTTTGAAAT ACTTATCATA GTTAA6ATTA CGATTATCAC TAGGAACTCT ATAAAAATCA 5400

CCCAAATCAA TTGCATTTGC GCACTCTTCG TTAGTTAATA GTGTTTCATA CCTTTTTTCT 5460

CCGTGTCTAA TACCTATAAT CTTAATATCT TGTTCTGAGG CAAAAATTTC TGATACAGCC 5520

TTAGCCAACA CTTCAATCGT ACATGCTGGT GCTTTCTGAA CTAGTATATC TCCAGATTTC 5580

CCTTCTTCAA ATGCAAATAA AACCAA6TCT ACTGCTTCTT CCAATGTCAT CACAAAACGT 5640

GTCATGCTAG GTTCAGTAAT TGTAAGAGCA TTTCCTTGCT TAATTTGCTC AATCCAAAGA 5700



GGAACXy^CAG ATCCACGGCT ACACAGAACA TTCCXTATAGC GAGTCACACA TATCTTTGTA 5760

TGCTCAGGAT TTACCGTCCT GGACTTAGCA ACAGCAATCT TTTCCATCAT AGCCTTGGAT 5820

GTTCCCATAG CATTGACAGG ATAAGCCGCC TTATCTGTAG AAAGACAGAT AACTTGCTTT 5880

ACACCAGCTT CGATAGCCGC AGTGAGGACA CCAAAATGTT AGTTTTTACC 5940

GCTTCTACAG GGAAAAATTC ACAA6AAGGT ACTTGTTTAA GAGCAGCAGC GTGAAAAACA 6000

TAATCCACAC CATGCATAGC ATTTTTTACC GAAGCTAAGT CACGCACATC TCCAAGGTAA 6060

AAACGGATTT TCCCAGCCAC TTCTGGTACT TTTACCTGAA ACTCATGACG CATATCATCT 6120

TGTTTCTTTT CATCTCGCGA AAATATACGA ATCTCTGAGA CATCTGTTTC TAAAAAACGC 6180

TTGAGAACCG CATTCCCAAA TGAACCTGTC CCTCCTGTAA TTAGGAGAGT TTTTCCTGTA 6240

AATTGTGACA TATATTACAC TTCTCCTTCT AGTATGTCTG CAATTTTCTT ACAAGCCGTT 6300

CCATCTCCAT ATGGATTTGA AGCTTGACTC ATTGCTTGAT AAACTGAATC ATTTTCTAAT 6360

AATTCTTTAA AATGCCTATA AATATTATTT TCATCAGCAC CTACAAGTTT CAAAGTCCCT 6420

GCTTCAATTC CCTCTGGACG TTCAGTTGTA TCTCTCATAA CCAAAACAGG TTTTCXTTAAA 6480

CTTGGAGCCT CTTCCTGAAT ACCACCACTA TCTGTTAAAA TTAAATAACT TCTTGATAAA 6540

AAATTGTGAA AATCTAATAC TTCTAAAGGT TCGATCATCT TGATACGTTC ACAGCCACTT 6600

AGTTCTTCCT CAGCAATTTG GCGAACACGA GGATTCATAT GGATAGGATA AATAGCCTTG 6660

ACATCTGAAT ATTCTTCAAT AATCCTTCTA ATTGCTCTAA ACATATGTCT CATCGGTTCA 6720

CCAAGATTTT CACGACX3ATG AGCTGTAATT AGAATAAACC TGCTTTCTCC TATCCATTCT 6780

AACTCAGGAT GCGTATAGTC CTCTTGAATT GTAGTTTGTA AAGCATCAAT CGCCGTATTA 6840

CCTGTCACAA ATATGCTCTC TGGAGTTTTT CCTTCTCTTA AAAGATTATC TTTTGAAAGT 6900

TGTGTTGGTG TAAAATGATA CTGAGCCAAA ACCCCAACTG CTTGACGATT AAACTCTTCA 6960

GGATATGGTG AATAGATATC GTAAGTGCGC AAACCAGCTT CAACATGACC AATTGGAATC 7020

TGTAAATAAA AGGCCGCCAG TGAACTAGCG AAGGTCGTAC TTGTATCCCC ATGAACTAAC 7080

ACCAAATCAG GTTTTTCTGA CTCTAAAATA GCCTTCATTC CTTCCAAAAT GCCAATGGTC 7140

ACATCAAATA AAGTTTGTTT ATCTTTCATA ATAGACAAAT CAAAATCGGG AATAATCCCA 7200

AATGTGTCCA AGACCTGATC CAACATTTGA CGGTGTTGGC CCGTAACGCA AACTAATGTT 7260

TCAATATTCT TACGTGTTCT TAACTCTTTG ACCAAAGGAC ACATCTTGAT GGCTTCTGGA 7320

CGAGTTCCAA ATACTACAAC TACTTTTTTC ATATATTTAC TTACTCCTAA CAAATAATGA 7380

ACGGTTCTTA AAATAAATTA GATAACGGCT AATCCATAAC ACCACCTCAG ACATACTTGA 7440

ACAAATAGCT AATGTTACTA AACTAAAATT ATCAGACAAG ATAAATATTC CTAATCCCAA 7500

AGTTTGGACA ATCGAAGCTA ATATAGTTGT CATTGTAGTT TCTTTCACTT TATCAATAGC 7560 

TCCTAAGACA GGCCATCCGT AAATCATAGA ATAAAAACTA GCAACAAAAG CGGGTAATAA 7620 

GTACTTAAGA AAATCTGCTG AAACGGTATA TTTTTCACCA CCAATTATAG AAAGAATTTG 7680 

ATTTGAAAAG AATAAAACTA TCAAAACTCC AAAGATAATA GGAATAAACA TAATCCGATT 7740 

AATACTCTTA ACCGATTGTA TATCTTTAGT ACGTATCATA TGCGGATATA AACTATTCGC 7800 

TATAGGATTA TACAATGATT TTGCTGCTGA AAGCAGTTGC ATTGCTATCC CCCAAAAGGC 7860 

TATCTCTTGA CTTTGTAAAT AAAAACCCGA AATGACTGTC GTAAAGACGC CAAAAATAGT 7920 

AGTTGCAAAA TTGGATAAAA AATAAATAGA GGATTCCTTT AAATCTTTAA CCCAAACAGA 7980 

CAGATAAGAA AATGATAATT TAATTCCATA ATAATGAAGG AATCTATAAG AAACTACTGC 8040 

AGCAACTAAA TTCCCAATTC CTTCCAATAT AGGAATCCAT AAAATAGAAG AATCATCTTT 8100 

TACTACAATA AATGTCAAAA TTGTAATGAT AGTTTTAGAA ATAATATAAG GAATTGCAAC 8160 

TGCATGCATC TTTTCAATTC CACGAAATAA AAAGTCAAAG ATAAAAATAT TGGTCACTGT 8220 

AGCTAACAAA TAAAAAACTG AAAAAAGAAT ATTCTCTCTC ATTATTGGGA TTTGCCACAT 8280 

CAATATGGTG TAAATTAGAA TCGAAATGAT AGATAAAAAT ATTTTTTCAA CTAGAGTATC 8340 

TCCAACTATC CTTCCAATCT TTGAGGGAGT AGTACAAGCA TTTACAATAT TTTTTGTAGC 8400 

TGATATCATG AAACCAAAAT CAATCACCAG TTGAACATAA GCTATTAACG CTTTAACATA 8460 

AATAACCATT CCATACGCGT CTAGCX3AAAG CACCCTTGTC AAATACGGGA GTGTTAATAA 8520

A6GAAATAGT AATTTAACAA TATTCAGAAT ATAGAGAGAA CTTGTATTTT TTATAAATGA 8580 

AATTCTATCA ACTTTCACGA ACTAGTCCTT CCAAAAAAAG ATCTAAATAG TCCAAACTAC 8640 

TTCTCGCTTT CAACACCAAT TCTGAAGGTA TTGTTATCGG TTTTAGATGA AAAGTTTCAA 8700 

GTTTCTTTAC AATACTATTA ACACTTGAAT CAAATAAAGA TTCACAACGT TGTAACTCTC 8760 

CAATTGCTCC ATAATAACGT GCTGTTTTTT CTTGGATGGCA TGCAATGGCA ATCACAGATT 8820 

TATTAAAACA TGTTGCCACT ACCCCAACAT GTAATTTACA AGTTAAAACC ACATCTACCA 8880 

TTTTCAACAA TGATGTCATT TCTGCAGGAG AATGATACTT GAATTGAAAA CAATCCTCAG 8940 

TTCTAACTAA TTTTCTAAAT TCCTGATAAT AAGCATCTTC ATAAGGTAGA ATGGAATCCG 9000 

AAGTTACTAC AACATAATAG TTAGGATTGT TTTCTAGAAA AAGACTAATT GATTCCGCAA 9060 

ATTTTTCAAG AGCTTTTTTG GAAT6ATTAT AGTGAACAAG AATTATCTTC TTATCTTTAG 9120 

CTTCTCTTTT CAATTGACAC AGCTGCTCTG TTTTTTCTTC TCTTAATTTA CTTGAAATAA 9180 

TTAAATCAAA GGTTTCATGC ACTGGAGCCG AAG6CGACAA ATGCTTCAAA GAATCAAATG 9240 



ATTCTCGATC ACGAACTGTA ATAAATTGAG CATGATTAAT AATTCTCTTT ATACCATAAT 9300

TCATCAAAGA ATCGTTATTA GGCCCTGCAC C7UVTACCTAA TACTCCTATA GGCTTTTTAA 9360 

AATATGAAGC CCAAATTCCC AAAGGTAAAA ATCGTTTAAA TTGGATTAAA TTATCACGAA 9420 

AACGTGCATT ATGCCCTTCC CCAAAATATC CTCCCGGGAT ATACAAAATA GCATCTGCTT 9480 

GTTTTTTAGT AAAACTTTGT TTTTGGCGAT ATTCTTTCAA GTACATTTGA AAGAAATCT6 9540 

ATGGATTATA AAAAGAAACT TCATATCCTT TAGATTCTAA TAAATCATAG ACAATCTCAC 9600 

CGTAAAGATA ATCACCGTAA TTACTTGAAC CATAATCCGT TGCACCATGT AACATAATTT 9660 

TTTTCACCAC TATTTTTTCA ACCTCCTAAA AATAAATATC ATAATCAAAC TATACATAAT 9720 

AGGACGATAA ACATCTATTG AACTACTTCT CACTAAAAGC AATAGTTGAG AAATTACCGA 9780 

AAAATAAATA ACTTTTGAGA TTTTACTTGT TTGAAAAGCT CTGAAATTTA ATCGCCATCC 9840 

ACTAAATATT CCCAAAACAA AACTCCAAAA AACACCACCA TAGTAACCAA AGTTCCAAAA 9900 

TAATTCTTCC ACAAAAGAAG AGCCTACAGG TAACCCCAAA AATTTATTAA TAACAACCGT 9960 

CGCTGATGCT TTATCAAAAA AATCACCAAC TAACCATCCA ATAGGAAAAA TTGATAGGAT 10020 

AGTGCGTAGA AATGTCATCC CATATTCATA TGGAATGCTA CTAGGCACAA CAGTTACAGC 10080 

AGAAGCTACT GTTAGGCTGG TCAGTCCCGA CTCTGAAAAT ACTTCCCCTA GTATATTCTT 10140 

TACAAAATCT AATGAAGAAA AGGAATCAAA TAAGTATATA CCTATAGTAT TCAAGTCGAA 10200 

ACGGTGCCCC CTAATAACAA CTAATACATT TAATAGAAAT ACAGTTACTA TTAAAAATAC 10260 

AAGTACTCTT TTCTTCGAAA AAGTAATCCC TAAAGATTGT GTGTATACTA AAACCAACGC 10320 

CAAGATTGAA AACACCTGGA TTTTACGACT TCCTGTTAGG ATCATTATCA AAATTAGGTA 10380 

AAACAACATT ACCCAAAAAA TAGTACGCTT TATAACTCGG GACAGCTTAT CTGAATAAAA 10440 

CAAGGAGAAC ACACCAGGAA GCATAA6TAC TCCTAAATCA TCTATTATTC CTGAACTAGC 10500 

TGCCTCTGAA TATGCTGAAT AGCTATTCGC CGCTCTAACT GCTAGTACTG TTTTAGAATC 10560

AGTTATTACC CTAGAAATAA AGCCCACTCC TGTTAAAATC CTACCCGCAT TGTACAAAAT 10620 

TTTCTCTTCA TTTTCCTGAT AATTTTGTAC TTCTGAATGA TAATGTACCT TTCCATCACT 10680 

ATAAAAAAAT AAATAGCCTA CAGAATAACA AAACAAAATC CAAATTATAA AAATATATGA 10740 

ATGAAATAAT TCTTCATTAT TATAGAAGTT ACTAGGGCTC CACAGCAGAG TTGTTTGAAA 10800 

CCCCATATAC TCATTGAAAA TTAATCCAAA CATAAAAAAA TAAGATAAAA TCAGATACCA 10860 

TACAGAAAAA TCATATATAC TAACTTTTTG TAAAATAAAA CCAGTAATTT GAAAAATAAT 10920 

TA6AAAGCAA ACCCATATAA ATATAGACGG AACATAATTA GATATAAGAA AACCATTATT 10980

CCAATTATCG AGAGTCCAGA ACAAGTAACA GAAAGCAAAT ATAAAACTTA ATGTCACTAG 11040

TGTCACTCTA CAAATATACT TTGTCTGCAT CTATATCTCC TTTATTACAC ACATTTCTTG 11100 

ATAACGATTC AATAATTTAC TAGCTTGATA ACAAATATCA TAGAGTCCAT CTGTCATACT 11160

GTTATTTATT TCAAAACGAT TGCATTCCTC AGATGTTAAA GACAGTACTT TATCTTTCCA 11220 

TAGCAACACA GACTCTTCGT TGATAGGTAA GTAACTAATG TTTTTGGTCA CATCTACTTC 11280 

TTGCGTCACT GTATCTGACG ATAAAATTTG TAATCCCGAT GCCTGAGCCT CTACTAGAGA 11340 

AACAGGCAAC CCCTCATATT TAGACGGAAG CAAAAAAACA TCCATCGCAG ATAATAAATC 11400

AGAAATATCA GTCCTTCTCC CTAAAAATAG CACATATGGG GTCAGATTTA GTTCTAAAGC 11460 

TTTCTGTTTT AATTTCTGCT CATCCTCACC ATTACCAACT AGGAGTAAAA TAACATTTGG 11520 

TTTGATTAAA ATGAGTTCTT TTAAAACGTT AAATAAATAA CTTTGGTTTT TTTGATCTGA 11580 

TAGGCGAGCT ATATTTCCTA ATACGAACTT ATTTGACACA TCTAATTCTC TACGACATTT 11640 

TTCTCTAACA TCTGACAAAA ATTGATACTT TTTCAAATCA ATTGCATTAA AAATAATTTC 11700 

AATTTTTCCG TCTTTATACG CTTTCTCTCC ATATAACCAC TTAGCCGAAT CTTCCCCACA 11760 

TGCAAACCAA TGAGTTGCTA AGATTTTTAC CAAAATTGTT ACTAATTTAC GCAATACTTT 11820 

TTGAAAACTG TTTTCTGTTA CATAAGCCAT ATGACTATGA ATAATTCTAA TTTTACAACC 11880 

AATTATTTTA GATAAGATCA GACCAATTGC AGATTTATAG CXy^TGGCAAT GAACTATATC 11940

ATAATCTCCT TTCTTTATTA TTCTAGCAAG AGAGAGAAAC TGATGTAGAG GCTTTTTCCT 12000 

TAATAGAGGC ACATGATAAA CCTTTGCACC CAATTCTTTC ATTTTATCCT CTAAAAATCC 12060 

TTGTTCTTTT CCAGGCACAA TAAAATCAAA TTGAATTTTT TTTCTATCAA TGTGAGAATA 12120 

ATAGTTGAAT AGAAAACTTT CTACTCCACC ACTATCTAGT GTTGTAAATA GATGTAATAC 12180 

TTTAATCATT CTTCTTCCTT AAGCTTAAGA TTCGCTTCTC TAATTCTATT TCTGTTTTTT 12240 

GTTTTTCTAA ACTAATTCTG TCCATGTVAGT TATCACAATT CTTAATTAGC TGTTTCCTGT 12300 

CAAGGTTTTG AATATACAAA GCCAAACAAT CTTTTTCCGA TTCATCCTTC ATAGGTAAAA 12360 

CGAAACCAAA ACCATTCTCT ATTGACACTT TTTCCATATA AGTATCTTCA CAAACTAAAA 12420 

TAGGTTTATA CAACAATGCA GCAAAGTAGA GTTTATTAGA CAAAGCATAG TCTAGTAAGG 12480

GAGTGTGATT CCCGTATAAA TTCAAAACAA CATCTGTATT CTTATAAAAA GACATGGTAT 12540

CTTTAGGCTG GAATGTGTCC ACCAAGTTAA CATTGCTGAT ATTTTTTTCT TGACAAAATT 12600 

CCCTTAATTC TCCTGCATTA GTACCTATAA AATTCAACTG AAATCGACTG TCATTTGCAA 12660 

AAAAATCGAT TATTTTTTTA TTTTGTTCTT GAAAACGAAT TAAACCAATG TAGGAAAGTT 12720 

GAATTGGAAA CGTACTATTA TTTTTTAACT GCTTTACCTC GTTTAATTCT ATCATATTGG 12780

GTAGGTTATG GGTAGTAAAA TACTCTCCCA TTGGTAAAAA AAATTTATAG CCGTCTGAAG 12840

AAACGATATT CATTAAAGAA TTTTTCACCA ATTGTTTCTG AACCAAACGA TAAACCAAAA 12900

ATTTTTCATA ACTGTAATCA CGAATATCAT AAATATATCT ATTTTTAAAT GAAAAGAGAA 12960

GAAAATCTAC TAAAATGAAA GACACAATAC TATGTAACGG CAATATCATA TCATAATCAT 13020

TTTCTTTTAG CTTCTTTTTA ATTTCTTTTC TGAATTTTAC ATAACCTAAT ATCTTACTTA 13080

ATTTTCCTTT ACCAGAAAAA GAAATACGAT AGTAGTTTTG TTTTGTAATA ATCTCGTTAA 13140

TATTCTTATC CCAATATATA ACATCGTAAC TAATAGACAG TTTCTTCAAT AATTCTTTAT 13200

AAAAATTGAA GTAAGGAGTT AGATATATAT TATCAGATAG TATAAACAGT ACTCTCATTA 13260

AATTATTCTT TCTTACTTTC CCTCTCTAAA CATGTCTCCA GTTCGAGCAT AAACTGCTCT 13320

TTTGAAAAGT GATTTTCATA GTAACAACGA GCTTTCTTTC CTAACTCTCT TTGTCTCTTA 13380

ATAGATAACA TACTAAATTT ACAAATATTT TTTGCCAATT GTTTTACATC TCGTTCGGGA 13440

CTAACATATC CACAATTTGC TTCTTCTACA ATTATTTTAG CATCTCCTGA AATTGCACCT 13500

ATAATTGGTT TGCCTGCCGC CATATAAGAk TGTACCTTCC CAGGTATAGT ACGAGAAACT 13560

ATCGAGTCTC CTATTAAAGA AACTAACATA GCATCTGATT TTTTATAGAA GGATGGCATT 13620

TCCTCCAAAG AACGTCTTCC ATAGAAGGAA ATATTCTTTA ACTCCAATTC ATGAGCTAAT 13680

GCTTTCATGC TTAACAATTC CGTACCATCT CCAACAAAAT GAAAATGAAT TTTCTTGGGT 13740

AAATTGGTAT TCTTCTCTAT CAAACTGGCA GCTTTCAAAA TAGTTTCCAA ATTTTGTGCT 13800

TTGCCAATAT TACCAGCAAA AGTTAGGTCA ACACTTTCTT TATTAACTAT AGATTCATCA 13860

GGGATAAAAA GATCTTCTGC ATATTGTGGC AAATATGTAA TCTTTTGTTC GGATATGTCA 13920

AATTGCTTCA CAAAATAATT TTTAAATGAT GGACTAGTGA CAAATATATA ATCACTAGCT 13980


CGGTAAACTT 


TTTTTGAGAT 


AAATTTAAAC 


AGCTTGAAAA 




1 J> U A X A 1 


14040 


CCACCTACGG TTAAACTATC TGGCCAAACA TCCATACAAT ATAGAAACAT CGGTTTCTTA 14100

TATTTTTTTT TATAAGCCAT ACCAGCCCAT GCCATCATAA CTGGAGACAA TTGGTTAACG 14160

AATACACAGT CAAAATTCGA TCCATCTTTC GTTTTATACC TCCCCAATAA AACTCCTAAA 14220

GTAGAACTAA TTGCAAAGCT AAAATAATTC AACAATCXSAA ATACAACACT TTTTTTTCTA 14280

GGGATTGTAT AAGAACGATA TATCGTAACA CCTTCTATAA TCTCACGTCT TTTTTTATTA 14340

TGACGATAAT CTGCATATAT CTTCCCTTCA GGGTAATTAG GAATCCCAGC CAAAACAGAG 14400

ACTTCATGCC CTTTTCGAAC TAAATCTTCA CAAATATCTG ACAACCTGAA TGGTTCTGGC 14460

TTATAATGTT GGCAAACAAA TAGTATTTTC ATTGTCCAAT TTAACTTTCT TTCTTACCAC 14520

TACCCTCTAC AATACCTTTT CGTTTCAGTA CGTAAGGTAT TGTCTTAACT ATACATCTAA 14580

TATCCATTAT CAAAGACAGA TGTTTAACAT AGTAGCCATC TAACTCCGTC TTCATCTCAA 14640 

CAGACAAAGT ATCACGCCCG TTAATTTGTG CCCATCCAGT TAACCCTGGC AAGATATCAT 14700 

TTGCTCCATA CTTATCTCTC TCTGCAATCA AATCTAGTTC ATTTATACCC GCTGGTCTAG 14760 

GACCTACAAT ACTCATATTA CCAACAAGAA TATTAAACAA TTGTGGTAGT TCATCCAAAG 14820 

ATGTTTTTCG CAAGAAAGCC CCTACTTTTG TAATCyATTG CTCTGGATTA TATAAGTTTC 14880 

GAGGCGCCAC ATTTTTAGGT GCATCTATTT TCATAGACCT AAATTTCAAA ATATAGAAGT 14940 

ATTCTTTATG AATACCAAAG CGTTTTTGCT TAAATATAAC CGGACCTTCT GAATCAAGTT 15000

TAATCGCAAT TGCAATTATC ATAAAAACCG GACACAATAT TATTATCCCT ATTAAAGATA 15060 

ATAATATATC ACCTAATCGT TTTATTATAC CGTACATAAA CAACCTCCAA CTATAAATTC 15120 

TATTTCCATT TTTCATTCTA TTTCCATTTG ACAAATTAAA TCAGGCAGTA CATGCAACTA 15180 

CAGAAACTCA ATATATATTT GGTCACTCAA TGATTTTCAG AAATATAATT CTTTTATCCT 15240 

CTACGTCAGA TAAAACTTTT CTCCATCTAA ACAAAATTTA TTTGTTTCAG TAATATATGA 15300 

GTTCTCAATA ATGAATTAGA AGGTCCAGTT CAATTATTCT TCCAAATAGA CCGAATATTA 15360 

TTTGAAGACA TATCGGTTTC TGAAATTGCA ATCAGTACAT AAGCTAATAA ACTGATAAGT 15420 

ATGCTCTGTA AGAATGCCAG AGTTATATTG TAGTCCCCTT CCATACTATA TTCATTTTAT 15480 

TTTTTACCAT AATTTCCATA GGAACCGTAA ACTCCATACT TATTAACCGA GATATCCAAT 15540 

TTATTTAAAA CAACTCCTAG GAACAGTTTC CCTGTTTGTT TTAATTGTTG TTTCGCTTTT 15600 

TGGATATCAC GTTTATTCGC CTCACCTGTT GCTGTTACCA AGATGGACGC ATCACACTTT 15660 

TGAGTGATAA TTGCCGCATC AATAACAATT CCAATAGGCG GTGTATCAAT AATGATATAA 15720 

TCAAAATATT TACGCAATGT TTCAATCATA TCATTAAAAT TTTTACTTTG TAACAAGGCT 15780 

GTAGGGTTTG GTGATACAGA TCCCGATTGA ACTACAAATA AATTTTCAAT ATTTGTATCA 15840 

CATAAACCGT GAGATAAATC AGCTGTCCCA GATAAAAATT CTGTTAGCCC TGTAATTTPT 15900 

TCACGAGATT TAAAAACTCC TAACATAACT GAATTTCGAG TATCGCCATC GATCAAAAGA 15960

GTTTTATAGC CTGCACGCGC AAACGACCAT GCTATATTTA TGGAAGTAGT TGTTTTTCCT 16020 

TCCCCAGGGT TAACAGAAGT AACGGAAATT ACTTTTAGTT TATCTCCGCT CAACTGTATA 16080 

TTTGTACACA AGGCATTGTA ATATTCTTCT GCCTTCTTAA TGAACTCCAG TTTTTTTTGT 16140 

GCTATTTCTA ATGTCGGCAT CCTTCTCTCC TATTTCAACT TACCCAAGTT TGGCACAACT 16200 

CCCAAAAGTG TCATCTGCAA TGTATTTTCG ATATCTTCCG GACGTTTCAC ACGAGTATCC 16260 

AAAAGTTCAA GATGAAGAAC TATAACACTA GTTCCAATCA CCCCTGCCAA AAAACCAATT 16320

AGTGTATTGC GTTTAATATT TGGCGAAGAC GGGGATATCG CCGGCCTTGC CTCCTCCAGT 16380

GTTGTCACGT CAGAAACACG AGTAATACTG ATAATTTTTT GAGCAGCTAC TTCTCTCAAA 16440

GAGTTAGCGA TACGGCTTGC CTCTTCAGGA ACTCGATCAT TAACTGAAAT AGAGACAATA 16500

CGGGTATCAA CTGGTACTGT CACTTTAATT TTATTAGCCA AACCTTTTGG CGTCAAATCT 16560

AGTTTCAAAT CAGAAACAAC TTCCTCCAAA ACATCCTGCG AAAGGATAAT CTCACGGTAG 16620

TCTTTTACCA GATAAGTTCC TGCCTGCAAA TCCTGATTTG TCAACCCCGG CTTGTCTCCT 16680

TGATTGCGAT TCACTACCTA AATTCGCGTG GTACTCGTAT ATTCTGGCTT AACAATAAAA 16740

GTGCTATATG CAAAAGCCCC CGCACCTGTC ACAAGTGCCA CTATTAAAAT CATTAGCTTG 16800

CGTTTCCACA AGCTTTTAAC TAATTGAAAT ACATCGATTT CTATCGTATT TTGTTCTTTC 16860

ATCATTTCTC CTAAATTAGT TGATCCATTA CAATTTTTCG AGGATTGTCT ATAAAAAGTT 16920

CCTGAGCCTT CGCTTCTCCG TATTTTTGGG TAACAAGGTC ATATGCTTCT GCCATATGAG 16980

GAGGTCTACC GTCTAGATTG TGCATATCAC TTGC/IATGAC ATGAACCAAA TCCTGCTCTA 17040

AAAAATACTG AGCTCTTTTT TTCATGAATT TATAACGTTC GCCAAAAAGT TTGGGTTTGA 17100

GGACATGTGA ACTATTTACT TGCGTGTAAC AGCCCATATC GATCAGTTCT CGAACGCGTT 17160

TTTCATTATT TTCAAGAGCA TCATAGCGCT CAATGTGGGC AATGACTGGA GTAATTCCCA 17220

ACATCAAGAT CTTGCTCAAG GCGCTATGAA TATCGCGATA AGGAGTGTTC ATACTAAACT 17280

CTATCAAGGC ATAACGACTA TCATTGAGGG TCGGAATCCX5 CTTTTTTTCC AGCTTATCCA 17340

GAACATCTGG TGTGTAATAA ATTTCAGCCC CGTAAGCAAT GACCAAGTCA CTCGCCACTT 17400

CCTTAGCTAT TTCCCGAACC TGAAGAAAGT TTTCTGCTAT CTTCTCTTCC GGAGTTTCAA 17460

ACATGCCCTT GCGACGGTGA GAGGTAGAAA CAATGGTTCG CACCCCCTGT CTGTAGGATT 17520






TCCTCTCTTG 


ACTTGGGACC 


GTCATCTACA TCAAAAACGA 




TATGCGAATG GATGTCTATC ATTTCATCTA CCCTCCATCA CATCCTGTAT AGCTGCTTTA 17640

ACTACAGCTA AACTACTATC ATCTATTTCC ATCACATAGA GGTTACTGTC TGGCATTGCA 17700

TAAGAAGGAA GATCCATCCG ACCTGTCCCT TTTAAATCTT GAGAATTTAC TTTATAATTC 17760

CCTCCACTTT CTAACTGAGC ATTGACCAAA TTTATCATGG TCTCAAGTGG CATATTTGTT 17820

TGGATAGAAT CTTGCAAGCT ATTAATGATC GTACTATAAT TTTTCAGCAC TTCGGTTGAC 17880


GTTAATTTTT 


6AAGGATAGC 


CACAATCACC 




GGCGCCCGCG GTCACGATCG 


17940 


CCATCTGCTA GGGAGTAGCG CTCACX3AACA AAACCGAGAG CCTGTTCTGA ATCAAGATGA 18000

ACATTGCCTG CAGGGTAATA CTTTCCATTC GTATGGGCAG TAAATTCTTG ATCATTATAA 18060

ACATCAATTC CACCCAACAA ATCAATCAAT TTCAAAAACG AAGTGAAGTT CAATCGCACA 18120

TAGTAATTGA TATCCACTCC ATAGAGATTT TCTAAGGTGT GAATGGACGA ATCAACTCCA 18180

TAAATGCCCG CATGAGTCAA TTTATCTTTT TGATTATTTC CACCATCTGC GATTGGTACA 18240

TAGGCATCAC GTGGCGTTGT GGTCAAGAGG ATTTTCTTGG TATCTCGATT GACAGTCATC 18300

AGGATGTTGA CATCTGATCG CGACACCGAA CTAATAGGAC CATAGGTGTC AATTCCACTA 18360

ACATAGATAT TGAAAGACTG ACTCTTAGAC GTCTTAGGAG CTTCTACTTT TTTAGTGAAT 18420

CCCTTAGTAT AAATCTTTTT TATCTTCXSAT GCGTAGTCTG GATACTCTGA CTCGATGATG 18480

TTTTCAAAGA CACTATTTAG GACAATGGCC TTAGTCTCCC CTGCAATCAA ACTCTTGTAA 18540

GCTGCCAAGT AAGACGAACT CTGGTTGACC GTCAAATCGG TATTCTGACT TGACTTGATA 18600

TCAGCTAGTA ATTTCTGAAT ATTTTCATTA TTAGTCCCAG TCGGTGCTGT CACACTCGTC 18660

AGTTGCGTAA CATTTTCX5AT CTCACTATCT GCTAAAACAG CGACACTGAT TGAATATTCT 18720

GAGTAATTAG AAGTCGCATT TAAACGATTG GTCAGTCCAA CAAACTGCTG TACTGCAAAG 18780

AGCGACACAG AGCTGACAAG GATAGAGAAC ACCAACAGAA AAATAGTAAA CTTTTCAGCT 18840

TTTTTATAGA TAATCAAGAG TAGCCCTACC AAGGCAACTA GTAGGACTAA CGCAGTTACC 18900

ACTAGATTAA GATATCTAAA AGCAAGGATA TTGTACTTAA AGATTAAGAA CAATAAAAAA 18960

CAAACTAACA ATAAATAAAT AGTCAGCAAA ACTATATTAA CACTTCGCTT CACTTTCTGT 19020

GAACGTGATT TTTTAAAACG TCTACTCATG ATTAATACCT ATACATTGAA CATTATACGA 19080

TTATATCACT TTTTTACGGT AATGTCTACA CCTTTATTTT TACTATCTGC ATCTTTAAGT 19140

ATCTTAGTAG ACTTCCCGCG AAACAAAAAT ATAGTAAAAT GAAATAAGAA CAGAACAAAT 19200

CGTTCAGGAC AGTCAAATCG ATTTCTAACA ATGTTTTAGA AGCAGAGGTG 19250



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21706 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AAAGTTGAAA GACT6CTAGC TGTTTTTGAT ACCAATCGTT TCCAACTACA GAGCAAACAG 60 

TATACAAAGT TTGTTTTTGG ATGTAAGCTT CTTGATGGAC AATTCCAAGA AAATCAAGAA 120 

ATTGCTGACC TTCAATTTTT TGCCATTGAC CAACTGCCGA ACTTATCTGA AAAACGCATT 180 

ACCAAGGAGC AAATAGAGCT TCTTTGGCAG GTTTATCAAG GTCATAGGGG GCAATATCTT 240

GACTAAGAAG ATGATTATCG TATTTCTAAA TCCATTTTTA ACAACTAGCA TGGTATAATA 300

ATATGCAGGA AAATTTTGAA TTATGAGGAA GACTAGATGA ATTTATGGGA TATTTTCTTT 360 

ACGACTCAGG CAACCGAGCC GCCCAAATTT GACCTTTTTT GGTATGTTAG CCTATTTACG 420 

CTCTTAGCCT TAACCTTTTA TACAGCCCAT CGCTATCGTG AAAAGAAGGT TTACCAACGA 480 

TTTTTCCAAA TCTTGCAGAC TGTTCAGTTA ATCCTTCTTT ATGGTTGGTA CTGGGTCAAT 540

CATATGCCAC TGTCAGAAAG CCTACCCTTT TACCATTGCC GTATGGCTAT GTTTGTGGTA 600 

CTCTTGCTTC CTGGTCAATC CAAATATAAA CAATACTTTG CATTATTGGG AACATTTGGG 660 

ACATTAGCAG CCTTTGTTTA TCCAGTGCCA GATGCTTACC CTTTTCCACA TATCACCATT 720 

CTATCCTTTA TCTTTGGTCA TTTAGCACTC TTGGGGAACT CTCTAGTTTA TCTATTGAGA 780 

CAGTATAATG CGCGATTGCT GGATGTGAAG GGAATTTTTC TCATGACCTT TGCCCTAAAT 840 

GCCTTGATTT TTGTGGTCAA TTTGGTGACA GGTGGCX^ATT ACGGATTTTT GACAAAACCX3 900 

CCATTGGTTG GGGATCACGG TCTAGTAGCT AATTATTTAC TTGTTTCAAT TGTGCTGGTA 960 

GCTACTATCA GTTTGACTAA GAAAATCTTA GAATTCTTTT TAGCTCAAGA AGCAGAAAAA 1020 

ATGATTGCAA AGGAAGCTTA ACACAGAGCT TTCTTTTTTG CTCTTAGAGA GTTTTTACAA 1080 

GCAGCTTATA AAATAAGAAT TTCTGAATAG ACAAACTCAA AAAATGGCTG GGA7ATTTAG 1140 

GAAAAAAGCA AGCACGATTA AATTTTTTGT GTTATAATAT TTTGTGAATA GCTATGCCTA 1200 

TGTTTAGCTA TGGAATAATA CGAAGTGCGA AACTTGGAAG ATAGAGAGGA AGCGATGTAA 1260 

TGGCTAGAGA AGGCTTTTTT ACAGGTCTAG ATATTGGAAC AAGCTCTGTC AAGGTGCTTG 1320 

TGGCCGAGCA GAGAAATGGT GAATTAAATG TAATTGGCGT GAGTAATGCC AAAAGTAAAG 1380 

GTGTAAAGGA TGGAATTATT GTTOATATTG ATGCAGCAGC AACTGCTATC AAGTCAGCCA 1440

TTTCCCAAGC GGAAGAAAAG GCAGGCATTT CGATTAAATC AGTGAATGTC GGCTTGCCTG 1500 

GTAATCTTTT GCAGGTAGAA CCAACTCAGG GGATGATTCC AGTAACATCT GATACTAAGG 1560 

AAATTACGGA TCAAGATGTT GAAAATGTTG TCAAATCAGC TTTGACAAAG AGTATGACAC 1620 

CTGACCGTGA AGTCATTACC TTTATTCCTG AAGAATTTAT TGTGGATGGT TTCCAAGGGA 1680 

TTCGTGACCC ACGTGGCATG ATGGGGGTTC GCCTTGAAAT GCGTGGTTTG CTTTATACAG 1740 

GACCTCGTAC TATCTTGCAC AATTTGCGTA AGACGGTTGA GCGTGCAGGT GTTCAGGTTG 1800 

AAAATGTTAT CATTTCACCA CTAGCAATGG TTCAGTCTGT TTTGAACGAft GGGGAACGTG 1860

AATTTGGTGC TACAGTGATT GATATGGGGG CAGGTCAAAC GACTGTCX5CT ACAATCCGTA 1920

ATCAAGAACT CCAGTTCACA CATATTCTCC AAGAAGGTGG AGATTATGTA ACTAAAGATA 1980

TCTCCAAGGT TTTGAAAACC TCTCGCAAAT TAGCGGAAGG CTTGAAACTG AATTACGGGG 2040

AAGCCTATCC GCCTCTTGCA AGCAAAGAAA CCTTCCAAGT AGAGGTTATT GGAGAAGTAG 2100

AAGCAGTCGA AGTGACGGAA GCCTACTTGT CAGAAATTAT TTCTGCACGA ATCAAGCACA 2160

TCCTTGAACA AATCAAGCAA GAATTAGATA GAAGGCGTCT ATTGGACCTC CCTGGTGGTA 2220

TTGTCTTAAT CGGTGGGAAT GCCATTTTAC CAGGTATGGT TGAGCTTGCT CAGGAAGTCT 2280

TTGGCGTCCG TGTCAAGCTT TATGTTCCAA ATCAAGTTGG TATCCGTAAT CCAGCCTTTG 2340

CGCATGTGAT TAGTTTATCA GAATTTGCGG GTCAATTAAC AGAAGTTAAT CTTTTGGCTC 2400

AGGGAGCGAT AAAAGGTGAG AATGACTTAA GTCATCAGCC AATTAGTTTT GGTGGGATGC 2460

TGCAAAAAAC AGCTCAGTTT GTACAATCAA CGCCTGTTCA ACCAGCTCCT GCTCCAGAAG 2520

TAGAGCCGGT GGCGCCTACA GAACCAATGG CGGATTTCCA ACAAGCTTCA CAAAATAAAC 2580

CGAAATTAGC AGATCGTTTC CGTGGATTGA TCGGAAGCAT GTTTGACXa^ TAAAGAGGAA 2640

AAATAAATTA TGACATTTTC ATTTGATACA GCTGCTGCTC AAGGGGCAGT GATTAAAGTA 2700

ATTGGTGTCG GTGGAGGTGG TGGCAATGCC ATCAACCGTA TGGTCGACGA AGGTGTTACA 2760

GGCGTAGAAT TTATCGCAGC AAACACAGAT GTACAAGCAT TGAGTAGTAC AAAAGCTGAG 2820

ACTGTTATTC AGTTGGGACC TAAATTGACT CGTGGTTTGG GTGCAGGAGG TCAACCTGAG 2880

GTTGGTCGTA AAGCCGCTGA AGAAAGCGAA GAAACACTGA CGGAAGCTAT TAGTGGTGCC 2940

GATATGGTCT TCATCACTGC TGGTATGGGA GGAGGCTCTG GAACTGGAGC TGCTCCTGTT 3000

ATTGCTCGTA TCGCCAAAGA TTTAGGTGCG CTTACAGTTG GTGTTGTAAC ACGTCCCTTT 3060

GGTTTTGAAG GAAGTAAGCG TGGACAATTT GCTGTAGAAG GAATCAATCA ACTTCGTGAG 3120

CATGTAGACA CTCTATTGAT TATCTCAAAC AACAATTTGC TTGAAATTGT TGATAAGAAA 3180

ACACCGCTTT TGGAGGCTCT TAGCGAAGCG GATAACGTTC TTCGTCAAGG TGTTCAAGGG 3240

ATTACCGATT TGATTACCAA TCCAGGATTG ATTAACCTTG ACTTTGCCGA TGTGAAAACG 3300

GTAATGGCAA ACAAAGGGAA TGCTCTTATG GGTATTGGTA TCGGTAGTGG AGAAGAACCT 3360

GTGGTAGAAG CGGCACGTAA GGCAATCTAT TCACCACTTC TTGAAACAAC TATTGACGGT 3420

GCTGAGGATG TTATCGTCAA CCTTACTGGT GGTCTTGACT TAACCTTGAT TGAGGCAGAA 3480

GAGGCTTCAC AAATTGTGAA CCAGGCAGCA GGTCAAGGAG TGAACATCTG GCTCGGTACT 3540

TCAATTGATG AAAGTATGCG TGATGAAATT CGTGTAACAG TTGTTGCAAC GGGTGTTCGT 3600

CAAGACCGCG TAGAAAAGGT TGTGGCTCCA CAAGCTAGAT CTGCTACTAA CTACCGTGAG 3660

ACAGTGAAAC CAGCTCATTC ACATGGCTTT GATCGTCATT TTGATATGGC AGAAACAOTT 3720

GAATTGCCAA AACAAAATCC ACGTCGTTTG GAACCAACTC AGGCATCTGC TTTTGGTGAT 3780

TGGGATCTTC GCCGTGAATC GATTGTTCGT ACAACAGATT CAGTCGTTTC TCCAGTCGAG 3840

CGCTTTGAAG CCCCAATTTC ACAAGATGAA GATGAATTGG ATACACCTCC ATTTTTCAAA 3900 

AATCGTTAAG TAAATGAATG TAAAAGAAAA TACAGAACTT GTTTTTCGAG AAGTTGCAGA 3960 

GGCTAGTCTG AGTGCTCATC GAGAGAGTGG TTCGGTCTCT GTCATTGCAG TTACCAAGTA 4020 

TGTAGATGTA CCGACAGCGG AAGCCTTGCT TCCGCTAGGT GTCCATCATA TCGGTGAAAA 4080 

TCGTGTAGAT AAGTTTCTGG AAAAATATGA AGCTTTAAAA GATCGAGATG TGACTTGGCA 4140 

TTTGATTGGT ACCTTGCAAA GACGTAAGGT GAAAGATGTC ATTCAATACG TTGATTATTT 4200 

CCATGCATTG GACTCAGTAA AGCTAGCAGG GGAAATTCAA AAAAGAAGTG ACCGAGTCAT 4260 

CAAGTGTTTC CTTCAAGTAA ATATTTCTAA AGAAGAAAGC AAACACGGTT TTTCGAGAGA 4320 

GGAACTGCTG GAAATCTTGC CAGAGTTAGC CApACTAGAT AAGATTGAAT ATGTTGGTTT 4380

AATGACGATG GCACCTTTTG AGGCTAGCAG TGAGCAGTTG AAAGAGATTT TCAAGGCGGC 4440

CCAAGATTTA CAAAGAGAAA TTCAAGAGAA ACAAATTCCA AATATGCCTA TGACCGAGTT 4500

AAGTATGGGA ATGAGTCGTG ATTATAAAGA AGCGATTCAA TTCGGTTCCA CTTTTGTTCG 4560

TATAGGTACA TCATTTTTTA AGTAGGAGAG AACCATGTCT TTAAAAGATA GATTCGATAG 4620 

ATTTATAGAT TATTTTACGG AGGATGAGGA TTCAAGTCTC CCTTATGT^ AAAGAGATGA 4680 

GCCTGTGTTT ACTTCAGTAA ATTCTTCACA GGAACCGGCT CTCCCAATGA ATCAACCTTC 4740 

ACAGTCGGCT GGCACAAAAG AGAACAATAT CACCAGACTT CATGCAAGAC AACAGGAATT 4800 

GGCAAATCAG AGTCAGCGTG CAACGGATAA GGTCATTATA GATGTTCGTT ATCCTAGAAA 4860 

ATATGAGGAT GCAACAGAAA TTGTTGATTT ATTGGCAGGA AACGAAAGTA TCTTGATTGA 4920

TTTTCAGTAT ATGACAGAGG TGCAGGCTCG TCGTTGTTTG GACTATTTGG ATGGAGCTTG 4980 

TCATGTTTTA GCTGGAAATT TGAAAAAGGT AGCTTCTACC ATGTATTTGT TGACACCAGT 5040 

GAACGTTATT GTAAATGTTG AAGATATCCG TTTACCAGAT GAAGATCAAC AGGGTGAGTT 5100 

CGGTTTTGAT ATGAAGCGAA ATAGAGTACG ATAATGATTT TTTTAATTCG TATGATTTAT 5160 

AATGCAGTGG ATATTTACTC CCTGATTTTG GTAGCCTTCG CTGTCATGTC TTGGTTTCCA 5220 

GGTGCCTACG AATCCAGTTT AGGTCGTTGG ATTGTAGCGT TGGTGAAACC AGTGCTTGCT 5280 

CCCTTGCAAC GCCTGCCTTT ACAGATAGCG GGTCTTGATT TATCTGTTTG GGTTGCGATT 5340 

GTTTTGGTTC GATTTTTAGG AGAAAACCTA GTGCGTTTTC TGGCGATGAT AGGATGAATA 5400

AAGGGATTTA TCAGCATTTC TCCATAGAAG ATCGTCCATT TCTTGACAAG GGAATGGAAT 5460 

GGATAAAGAA GGTAGAAGAT AGCTATGCTC CTTTTTTAAC TCCTTTTATC AATCCTCATC 5520

AGGAGAAGCT ATTAAAGATT TTGGCCAAAA CCTATGGTCT TGCTTGTAGC AGTAGTGGGG 5580

AATTCGTCTC GAGTGAGTAT GTTCGAGTTT TATTATACCC AGATTATTTC CAACCAGAGT 5640

TTTCAGATTT TGAAATATCT CTCCAGGAAA TTGTGTATTC CAATAAATTT GAACATTTAA 5700

CGCATGCTAA GATTTTAGGG ACAGTCATCA ATCAATTAGG GATTGAACGG AAACTTTTTG 5760

GAGATATCCT AGTAGATGAA GAACGGGCGC AGATTATGAT TAATCAGCAG TTTCTTCTTC 5820

TGTTTCAAGA TGGACTAAAG AAAATTGGTC GTATACCTGT TTCGCTGGAG GAACGTCCTT 5880

TCACCGAGAA AATAGATAAG CTAGAACAGT ATCGAGAACT GGATTTATCT GTGTCTAGTT 5940

TTCGATTAGA TGTTCTTTTA TCAAATGTTT TGAAACTATC TAGGAATCAA GCAAACCAGT 6000

TGATTGAAAA GAAACTTGTC CAAGTAAATT ATCATGTGGT AGACAAATCA GATTACACTG 6060

TTCAAGTTGG AGACTTGATT AGTGTGAGAA AATTTGGTCG CTTGAGATTA CTTCAAGATA 6120

AGGGACAAAC GAAAAAAGAG AAGAAAAAAA TAACCGTCCA GTTATTATTA AGTAAGTGAG 6180

GAATAGAATG CCAATTACAT CATTAGAAAT AAAGGACAAG ACTTTTGGAA CTCGATTCAG 6240

AGGTTTTGAT CCAGAAGAAG TCGATGAATT TTTAGATATT GTGGTTCGTG ATTACGAAGA 6300

TCTTGTGCGT GCGAATCATG ATAAAAATTT GCGTATTAAG AGTTTAGAAG AGCGTTTGTC 6360

TTACTTTGAT GAAATAAAAG ATTCATTGAG CCAGTCTGTA TTGATTGCTC AGGATACAGC 6420

TGAGAGAGTG AAACAGGCGG CGCATGAACG TTCAAACAAT ATCATTCATC AAGCAGAGCA 6480

AGATGCGCAA CGCTTGTTGG AAGAAGCTAA ATATAAGGCA AACGAGATTC TTCGTCAAGC 6540

AACTGATAAT GCTAAGAAAG TCGCTGTTGA AACAGAAGAA TTGAAGAACA AGAGCCGTGT 6600

CTTCCACCAA CGTCTCAAAT CTACAATTGA GAGTCAGTTG GCTATTGTTG AATCTTCAGA 6660

TTGGGAAGAT ATTCTCCGTC CAACAGCTAC TTATCTTCAA ACCAGTGATG AAGCCTTTAA 6720

AGAAGTGGTT AGCGAAGTAC TTGGAGAACC GATTCCAGCT CCAATTGAAG AAGAACCAAT 6780

TGATATGACA CX3TCAGTTCT CTCAAGCAGA AATGGCAGAA TTACAAGCTC GTATTGAGGT 6840

AGCCGATAAA GAATTGTCTG AATTTGAAGC TCAGATTAAA CAGGAAGTGG AAGCTCCAAC 6900

TCCTGTAGTG AGTCCTCAAG TTGAAGAAGA GCCTCTGCTC ATCCAGTTGG CCCAATGTAT 6960

GAAGAACCAG AAGTAGCTCC AATGCATCCG ATAGGTCCAA CACCAGCTAC AGAAACTGTT 7020

GATTCAATAC CGGGATTTGA AGCACCGCAA GAATCTGTTA CAATTTTATA AGAAATATTC 7080

TGAGAACAAT ATCTTATCCT TATATTTCCA GCGAGCAGGA GATGGTGTGA GTCCTGTAAT 7140

CCCTATTGAT AAGATTATCC TCTCAAAAAC TCAAGTCTGA AGCTAGTAAG ATTTGACGTT 7200

TCCCACX3TTA CGGGATAAGA GGGAGAAAGA CTAAATCTTT TTCCGAATAA AGGTGGTACC 7260

ACGATTTTCG TCCTTTTTGG AAGTCGTOGT TTTTAATTTG TTATTATTTA TAAAGGAGAT 7320

ACCATGAAAC TCAAAGACAC CCTTAATCTT GGGAAAACTG AATTCCCAAT GCGTGCAGGC 7380

CTTCCTACCA AAGAGCCAGT TTGGCAAAAG GAATGGGAAG ATGCAAAACT TTATCAACGT 7440 

CGTCAAGAAT TGAACCAAGG AAAACCTCAT TTCACCTTGC ATGATGGCCC TCCATACGCT 7500 

AACGGAAATA TCCACOTTGG ACATGCTATG AACAAGATTT CAAAAGATAT CATTGTTCGT 7560 

TCTAAGTCTA TGTCAGGATT TTACGCACCA TTTATTCCTG GTTGGGATAC TCATGGTCTG 7620 

CCAATCGAGC AAGTCTTGTC AAAACAAGGT GTCAAACGTA AAGAAATGGA CTTGGTTGAG 7680 

TACTTGAAAC TTTGCCGTGA GTACGCTCTT TCTCAAGTAG ATAAACAACG TGAAGATTTT 7740 

AAACGTTTGG GTGTTTCTGG TGACTGGGAA AATCCATATG TGACCTTGAC TCCTGACTAT 7800 

GAAGCAGCTC AAATTCGTGT ATTTGGTGAG ATGGCTAATA AGGGTTATAT CTACCGTGGT 7860 

GCTAAGCCAG TTTACTGGTC ATGGTCATCT GAGTCAGCAC TTGCTGAAGC AGAGATTGAA 7920 

TACCATGACT TGGTTTCAAC TTCCCTTTAC TATGCCAACA AGGTAAAAGA TGGCAAAGGA 7980 

GTTCTAGATA CAGATACTTA TATCGTTGTC TGGACAACGA CTCCATTTAC CATCACAGCT 8040 

TCTCGTGGTT TGACGGTTGG TGCAGATATT GATTACGTTT TGGTTCAACC TGCTGGTGAA 8100 

GCTCGTAAGT TTGTCGTTGC TGCTGAATTA TTGACTAGCT TGTCTGAGAA ATTTGGCTGG 8160 

GCTGATGTTC AAGTTTTGGA AACTTACCGT GGCCAAGAAC TCAACCACAT CGTAACAGAA 8220 

CACCCATGGG ATACAGCTGT AGAAGAGTTG GTAATTCTTG GTGACCACGT TACGACTGAC 8280 

TCTGGTACAG GTATTGTCCA TACAGCCCCT GGTTTTGGTG AGGACGATTA CAATGTTGGT 8340 

ATTGCTAATA ATCTTGAAGT CGCAGTGACT GTTGATGAAC GTGGTATCAT GATGAAGAAT 8400 

GCTGGTCCTG AATTTGAAGG TCAATTCTAT GAAAAGGTAG TTCCAACTGT TATTGAAAAA 8460 

CTTGGTAACC TCCTTCTTGC CCAAGAAGAA ATCTCTCACT CATATCCATT TGACTGGCGT 8520 

ACTAAGAAAC CAATCATCTG GCGTGCAGTT CCACAATGGT TTGCCTCAGT TTCTAAATTC 8580 

CGTCAAGAAA TCTTGGACGA AATTGAAAAA GTGAAATTCC ACTCAGAATG GGGTAAAGTC 8640 

CGTCTTTACA ATATGATCCG TGACCGTGGT GACTGGGTTA TCTCTCGTCA ACGTGCTTGG 8700 

GGTGTTCCAC TTCCTATCTT CTACGCTGAA GATGGTACAG CTATCATGGT AGCTGAAACT 8760

ATTGAACACG TAGCTCAACT TTTTGAAGAA TATGGTTCAA GCATTTGGTG GGAACGTGAT 8820 

GCCAAAGACC TCTTGCCAGA AGGATTTACT CATCCAGGTT CACCAAACGG CGAGTTCAAA 8880 

AAAGAAACTG ATATCATGGA CGTTTGGTTT GACTCAGGTT CATCATGGAA TGGAGTGGTG 8940 

GTAAACCGTC CTGAATTGAC TTACCCAGCC GACCTTTACC TAGAAGGTTC TGACCAATAC 9000 

CGTGGTTGGT TTAACTCATC ACTTATCACA TCTGTTGCCA ACCATGGCGT AGCACCTTAC 9060

AAACAAATCT TGTCACAAGG TTTTGCCCTT GATGGTAAAG GTGAGAAGAT GTCTAAATCT 9120

CTTGGAAATA CTATTGCTCC AAGCGATGTT GAAAAACAAT TCGGTGCTGA AATCTTGCGT 9180

CTCTGGGTAA CAAGTGTTGA CTCAAGCAAT GACGTGCGTA TCTCTATGGA TATCTTGAGC 9240

CAAGTTTCTG AAACTTACCG TAAGATTCGT AACACTCTTC GTTTCTTGAT TGCCAATACA 9300 

TCTGACTTTA ACCCAGCTCA AGATACAGTC GCTTACGATG AGCTTCGTTC AGTTGATAAG 9360 

TACATGACGA TTCGCTTTAA CCAGCTTGTC AAGACCATTC GTGATGCCTA TGCAGACTTT 9420 

GAATTCTTGA CGATCTACAA GGCCTTGGTG AACTTTATCA ACGTTGACTT GTCAGCCTTC 9480 

TACCTTGATT TTGCCAAAGA TGTTGTTTAC ATTGAAGGTG CCAAATCACl* GGAACGCCGT 9540 

CAAATGCAGA CTGTCTTCTA TGACATTCTT GTCAAAATCA CCAAACTCTT GACACCAATC 9600 

CTTCCTCACA CTGCGGAAGA AATCTGGTCA TATCTTGAGT TTGAAACAGA AGACTTCGTC 9660 

CAATTGTCAG AATTACCAGA AGTTCAAACT TTTGCTAACC AAGAAGAAAT CTTGGATACA 9720 

TGGGCAGCCT TCATGGACTT TCGTGGACAA GCACAAAAAG CCTTGGAAGA AGCTCGTAAT 9780 

GCAAAAGTTA TCGGTAAATC ACTTGAAGCA CACTTGACAG TTTATCCAAA TGAAGTTGTG 9840 

AAAACTCTAC TCGAAGCAGT AAACAGCAAT GTAGCACAAC TTTTGATCGT GTCTGAGTTG 9900 

ACCATCGCAG AAGGACCAGC TCCGGAAGCT GCCCTTAGCT TCGAAGATGT AGCCTTCACA 9960 

GTTGAACGTG CTACTGGTGA AGTATGTGAC CGTTGCCGTC GTATCGACCC AACAACAGCA 10020 

GAACGCAGCT ACCAGGCAGT TATCTGTGAC CACTGTGCAA GCATCGTAGA AGAAAACTTT 10080 

GCGGAAGCAG TCGCAGAAGG ATTTGAAGAG AAATAAGATT GAAAAGTCTA GGCAAAATTC 10140 

AATTTGAGAA GAAAAGACAA CTAATTTTAT AGTCTATTAA ACGCATTGTA TCACGTTTTT 10200 

GAATACCTGA TATGATGCGT TTTTTATTTA TTTTAAAAAT TTGCGAGGTA TGACTTTTTA 10260 

TACTCAACAA GAATCAAAGA GAAACTTAGC AAGCTAACAG TAGTAAGATA AAATAGGAAT 10320 

TTGATATTAG GGATAAGATT GGTAAATAGT GTAATATTTT TACAACAATA AATTTATATA 10380 

GTTATTTCTG GTTTCTGAAA AGTATTATAT TTTATTTCAT ATTATACAAA TTTTTATTTT 10440 

ATAATATCAG AACATACTTT TTTTAAAAGC AAATATGATA CAATTTTATT TGAAAAAAAT 10500 

AAAAAAGGAG ATTTTATTAT AAAATTAAAA AGACTTGCTT TAATTAGTGG TATCGTCGGT 10560 

CTTGTGGGAG GAATTTTACT TCTTATTGGT CCTTTTGTCT TGTTGGGAAT AGCGGTAAAC 10620

ACAGCTGCTA CAACTCTTAA TGGAGGAGCT ACTGCAGGGG CTTTTTCAGG TGTAGCCTTA 10680

CTCTTGAATG CCTTGAAGAT TGCAAATCTT GTTCTTGGTA TCATTGCTAT TGTTTACTAT 10740 

AAAGGAGATA AGCGTGTAGG TGCAGCTCCG TCTGTACTAA TGATTGTTTC TGGTGGAGTT 10800 

AGTCTCATTC TATTCCGTTC TTAGGATGGG TTGGGGGGAT TTTTGCTATT ATCGGAGGAT 10860

CTCTATTCCT TTCAACATTG AAGAAATTCA AATCAGAAGA ATAAAAGGTA TTTTAGCATG 10920

AAAAGAACAA AAAAGTTTAT CGGTATAGGA GTAGCTCTAT TATCTCTTTC TCTTCTAGTT 10980 

GCATGTGGAA CATAAAGTTC AAAGAATACT TCAACAAGTA ATGATGAGAA GACAGTAGCA 11040

ACATCCAATA GTTCAAAAGA AACAATCACT TTCGATACAC CGGTTGTAAC AGACGATGCG 11100 

ATTGAATCAA TACGCACTTA TGCAGATTAT ATAGATCTTT ATAAAAATAT TTTTGATGAT 11160 

TATTTTACTA AAGCTGAGGA AGGTTTCAAA GGCATAGCTA TGGAAAATAA TGACTCGTTT 11220 

ACTAAACTAA AAGAGTCAAC TCAAAAATTA TTCGATGCGC AGAAAAAAAG GTTAAATAAT 11280 

GAAGATAGAA TAGAAACAAC CAAAAACAAT GTGATTGCCA AACATTGTCA AACAGTCCTT 11340 

TCCTTTTTGG TTTTGACTAG CTTTTTTGTG AAAAATTGTG TAAAATAGAA TAGATAAACG 11400 

AGGGGAAACC TCX3GAAAATT TAAAGGAGAA TCCATCTAAT GGTAAAATTG GTTTTTGCTC 11460

GCCACGGTGA GTCTGAATGG AACAAAGCTA ACCTTTTCAC TGGTTGGGCT GATGTTGATT 11520 

TGTCTGAAAA AGGTACACAA CAAGCGATTG ACGCTGGTAA ATTGATCAAA GAAGCTGGTA 11580

TCGAATTTGA CCAAGCTTAC ACTTCAGTAT TGAAACGTGC TATCAAAACA ACTAACTTGG 11640

CTCTTGAAGC TTCTGACCAA TTGTGGGTTC CAGTTGAAAA ATCATGGCGC TTGAACGAAC 11700

GTCACTACGG TGGTTTGACT GGTAAAAACA AAGCTGAAGC TGCTGAACAA TTTGGTGATG 11760 

AGCAAGTTCA CATCTGGCGT CGTTCATACG ATGTATTGCC TCCAAACATG GACCGTGATG 11820 

ATGAGCACTC AGCTCACACA GACCGTCGTT ACGCTTCACT TGACGACTCA GTTATCCCAG 11880 

ATGCTGAAAA CTTGAAAGTG ACTTTGGAAC GTGCTCTTCC ATTCTGGGAA GATAAAATCG 11940 

CTCCAGCTCT TAAAGATGGT AAAAACGTAT TCGTAGGAGC TCACGGTAAC TCAATCCGTG 12000 

CCCTTGTAAA ACACATCAAA GGTTTGTCAG ATGACGAGAT CATGGACGTG GAAATCCCTA 12060

ACTTCCCACC ATTGGTATTC GAATTCGACG AAAAATTGAA CGTCGTTTCT GAATACTACC 12120 

TTGGAAAATA AAAAATTGTA AGTCTAGAAT TGATTTCTAG GCTTTTTATG TTAGTATGGA 12180

AGTATGATAA GGAATAAAAA ACAAGATTAT GTACTGGCCT ACAAGCAACC AGCTTCAACC 12240 

ACTTACATGG GTTGGGAAGA AGAAGCTTTA CCGATAGGCA ATGGTTCTTT AGGAGCAAAA 12300 

GTATTTGGCC TTATAGGGGC TGAACGGATT CAATTTAATG AAAAAAGTCT CTGGTCTGGA 12360 

GGTCCACTTC CTGATAGTTC AGATTATCAG GGTGGAAATC TTCAGGATCA GTATGTTTTT 12420 

TTAGCTGAGA TTCGGCAGGC TTTGGAGAAG AGAGATTACA ATCTGGCTAA GGAACTGGCT 12480 

GAGCAGCACC TAATTGGGCC AAAAACGAGT CAATATGGGA CCTATCTGTC TTTTGGGGAT 12540

ATTCACATTG AGTTCAGCCA GCAAGGTACG ACTTTGTCTC AGGTGACGGA CTATCAGAGA 12600

CAGCTGAATA TTAGTAAGGC ACTTGCGACG ACTTCTTATG TCTATAAGGG AACGCGATTT 12660

GAACGTAAAG CTTTTGCGAG TTTTCCAGAT GATCTCTTGG TTCAATGTTT TACTAAGGAA 12720

GGGTTGGAAA CTCTAGATTT TACTATAGAA CTATCCTTGA CCTGTGATTT GGCTTCTGAT 12780

GGAAAGTATG AGCAGGAAAA ATCTGATTAC AAGGAGTGTA AGTTGGATAT TACTGATTCT 12840

CATATCTTGA TGAAGGGAAG AGTTAAGGAT AATGATCTGC GGTTTGCTAG TTATCTAGCT 12900

TGGGAAACGG ATGGAGATAT TAGAGTTTGG TCAGATAGGG TTCAGATATC AGGAGCCAGT 12960

TATGCCAATC TCTTCTTGGC CGCTAAGACG GATTTTGCCX: AAAATCCTGC TAGCAATTAT 13020

CGCAAGAAAC TAGATTTAGA GCAACAGGTG ATAGACTTGG TGGACACAGC TAAAGAAAAG 13080

GGCTATACCC AATTGAAATC AAGGCATATC GAGGACTACC AAGCCTTATT CCAGCGTGTT 13140

CAATTGGATT TGGAAGCTGA TGTTGACGCA TCCACTACAG ATGATTTGTT AAAAAATTAT 13200

AAGCCACAAG AAGGGCAGGC TTTGGAGGAG CTGTTCTTCC AGTATGGACG GTATTTATTG 13260

ATTAGTTCGT CCAGAGACTG CCCAGATGCT CTACCAGCTA ACCTACAGGG AGTCTGGAAT 13320

GCGGTCGACA ATCCTCCTTG GAATTCGGAC TATCACTTAA ATGTCAATCT GCAGCTGAAT 13380

TATTGGCCAG CCTATGTTAC CAATCTCCTA GAGACGGTCT TTCCAGTCAT CAACTATGTA 13440

GATGATTTGC GTGTCTATGG TCGTCTAGCG GCTGTAAAGT ATGCAGGAAT CGTCTCTCAG 13500

AAAGGTGAGG AGAATGGTTG GTTGGTTCAT ACTCAAGCGA CTCCCTTTGG TTGGACGGCA 13560

CCTGGTTGGG ATTACTATTG GGGTTGGTCA CCAGCTGCCA ATGCGTGGAT GATGCAAACC 13620

GTTTATGAAG CCTATTTATT TTATAfiGGAC CAAGACTATC TCAGGGAGAA AATTTATCCC 13680

ATGTTGAGGG AAACGGTTCG TTTTTGQAAT GCCTTTTTAC ATAAGGATCA GCAGGCGCAG 13740

CGTTGGGTGT CTTCTCCGTC TTATTCCCCA GAACATGGGC CGATTTCGAT TGGCAATACC 13800

TATGACCAAT CTCTGATTTG GCAGTTATTT CATGATTTTA TTCAGGCTGC TCAGG/U^TTG 13860

GGACTGGATG AGGACTTGTT GACTGAGGTT AAGGAGAAGT CTGATTTACT AAATCCTTTG 13920

CAAATCACTC AATCTGGTCG AATCAGGGAG TGGTATGAGG AGGAAGAGCA GTATTTTC7VA 13980

AATGAGAAAG TGGAGGCCCA GCATCGGCAC GCTTCCCATC TAGTGGGACT CTATCCTGGC 14040

AATCTCTTTA GCTACAAGGG ACAAGAGTAT ATTGAAGCGG CGCGTGCTAG CCTCAATGAT 14100

CGTGGAGATG GCGGCACAGG CTGGTCCAAG GCTAATAAGA TCAATCTCTG GGCGCGTTTG 14160

GGAGATGGCA ATCGAGCCCA TAAATTATTG GCAGAGCAGT TAAAGACATC CACCTTGCAA 14220

AATCTTTGGT GTAQCCATCC TCCTTTTCAG ATAGATGGTA ATTTTGGTGC TACTAGTGGC 14280

ATGGCAGAAA TGTTACTCCA GTCTCATGCA GCTTATCTGG TACCTCTAGC TGCCCTACCT 14340

GATGCTTGGT CAACAGGTTC TGTTTCAGGC TTAATGGCAC GTGGACATTT TGAAGTGAGC 14400

ATGAGCTGGG AAGATAAAAA ACTCTTACAG TTGACCATTT TATCAAGGAG TGGAGGAGAT 14460

TTGCGAGTTT 


CTTATCCAGA TATTGAGAAG 


AGTGTGATTA 


AAATGAATCA 


AGAAAAAATA 


14520 


AAAGCGA7VAT 


GCATGGGGAA 


AGATTGTATT 


TCGGTGGCAA 


CAGCAGAAGG 


TGATCTTGTT 


14580 


CAATTTTATT 


TTTAAGAAGA 


TGTTATAAGG 


CAGTAATTTG 


AAACTGCCTT 


TTAATAAGGA 


14640 


TTTAAGAATA 


TAAGCAGTTT 


TCAACTAGTT 


GAAAAAACGT 


TATAATGATA 


ATAGGAAGTA 


14700 


ATACTC/kATG 


AAAATCAAAG 


AGCACAAACT 


AGGAAGCTAG 


CCGCAGGTTG 


CTCAAAACAG 


14760 


TGTTTTGAGG 


TTGCAGATGG 


AAGCTGACGT 


GGTTl'GAAGA 


GAGATTTTCG 


AGGAGTATAA 


14820 


TTTGTTTGAT 


AGAGGGTGG6 


TCTGATGGCT 


TATATTGAGA 


TGAAACACTG 


TTACAAGCGT 


14880 


TATCAGGTTG 


GGGACACGGA 


GATTGTGGCC 


AATTGTGATG 


TGAATTTTGA 


GATTGAAAAG 


14940 


GGGGAGCTGG 


TTATTATCCT 


TGGTGCTTCA 


GGTGCAGGCA 


AGTCAACAGT 


TCTTAACCTT 


15000 


CTTGGGGGAA 


TGGATACCAA 


TGATGAAGGG 


GAAATCTGGA 


TTGATGGTGT 


TAATATTGCG 


15060 


GATTATA6TT 


CCCACCAGCG 


CACCAATTAC 


CGTAGAAATG 


ATGTGGGGTT 


TGTTTTTCAG 


15120 


TTTTATAATC 


TAGTTTCTAA 


TCTGACAGCT 


AAGGAAAATG 


TGGAACTGGC 


TTCTGAAATT 


15180 


GTGACAGATG 


CCTTGAATCC 


TGATCAGGCC 


TTGACAGATG 


TAGGTCTGGC 


TCATCGTCTC 


15240 


AATAACTTTC 


CAGCCCAGCT 


TTCTGGAGGG 


GAGCAACAGC 


GAGTCTCCAT 


TGCACXSCGCG 


15300 


GTAGCCAAAA


ATCCTAAAAT 


TCTCCTTTGT 


GATGAACCGA 


CTGGAGCCTT 


GGATTATCAC 


15360 


ACGGGCAAGC AGGTTTTGAA AATTCTCCAA GACATGTCTC GTCAAAAGGG AGCGACGGTC 


15420 


ATCATCGTGA 


CTCATAATGG 


AGCTTTGGCG 


CCCATTGCTG 


ATCGCGTGAT 


TCAAATGCAC 


15480 


GATGCCAGTG 


TCAAGGATGT 


GGTGCTCAAC 


CAGCATCCTC 


AGGATATTGA 


CAGTTTGGAG 


15540 


TACTAGCATG 


ATCAAGCGAA AAACTTATTG 


GAAGGACTTA 


GTTCAGTCCT 


TCACAGGCTC 


15600 


CAAGGGGCGT 


TTTTTATCCA 


TCTTGATCCT 


GATGATGTTG 


GGATCTCTAG 


CCTTAGTAGG 


15660 


CCTCAAAGTA 


ACCAGTCCCA 


ACATGGAGGC 


GACAGCTAAT 


GCTTATTTAA 


CAACTGCTCA 


15720 


AACCTTGGAT 


TTGGCAGTCA 


TGTCTAACTA 


TGGCTTGGAT 


CAAGCAGACC 


AAGAAGAACT 


15780 


AAAACAGACG 


GAGGGCGCAG 


AGGTCGAGTT 


TGGCTATTTG 


ACAGATGTGA 


CTATGGATAA 


15840 


TGGGCAGGAT 


GCCATTCGGC 


TGTACTCCAA 


ACCAGAGCGA 


ATTTCAACCT 


TTCAGCTAAG 


15900 


AAAGGGACGA 


CTTCCTCAGT 


CAGACAAGGA 


AATCGCTTTG 


GCCACTCATT 


TGCAAGGCCA 


15960 


ATACAGCGTG 


GGACAGGAGA 


TTAGTTTTAA 


AGAAAAAGAA 


GAGGGTCATT 


CCTCTTTAAA 


16020 


AGACCATACT 


TATACCATTA 


CTGGTTTTGT 


GGATTCGGCT 


GAAATCCTCT 


CCCAGCGAGA 


16080 


TATGGGCTAC 


GCAGGAAGTG 


GAAGTGGGAC 


TCTGACAGCC 


TATGGGGTGA TTTTACCTAG 


16140 
TCAATTTGAT CAGAAAGTCT ACAATATAGC TCGTTTGAAA TATCAAGATT TAGCGGGTTT 16200

AAATGCCTTT TCATCAGCTT ATGAAGAAAA ATCCAAGCAA CATCAAGAAG AGCTTGAACA 16260 

AATTTTATCA GATAATGGCA AGGTACGTCT GCAACTTTTG AAAAAAGAAG GACAAGAGTC 16320 

TCTAGACAAG GGGCAAGAGA CCCTTGACAA GGCTCAGACT AATTTGCAGG AAGGCAAGCG 16380 

TCGTTTAGCA GCTGCTCAAG CTCGTATACA GGCTCAAGAA AGTCAACTAG CCTTGTTTCC 16440 

TCAAGTTCAG AGAGAGCAGG CTAGTGCTCA ACTTACCCAA GCCAAGCAGG AATTGGGCAA 16500 

GGAAGAGGAC AAACTAAAGC AAGCTGAACA AAATCTAGCC CAAGAAAAGG AAAAATTAGA 16560

AAAACATCAG CAAGTCTTGG ATGATTTGGC GGAGCCAAGG TATCAGGTTT ATAATCGTCA 16620 

GACCATGCCA GGTGGTCAGG GCTATCTTAT GTATAGCAAT GCTTCATCCA GTATTCGAGC 16680 

AGTGGGCAAT ATCTTTCCTG TGGTACTTTA TGCCGTAGCA GCCATGGTGA CCTTTACGAC 16740 

CATGACTCGC TTTGTAGACG AAGAGCGAAC TCATGCAGGG ATTTTTAAGG CCTTGGGTTA 16800

TCGTAGTAAG GATATTATCG CCAAGTTTCT CCTTTATGGA CTAGTAGCTG GGACTGTCGG 16860

AACGGCTCTA GGTAGTATAC TTGGTCATTA TTTGCTAGCC AGTGTAATTT CAAGTGTCAT 16920

TACAAAAGGC ATGGTGGTGG GAGAAACTCA GATTCAGTTC TATTGGACCT ATAGCTTACT 16980 

AGCTTTTGTC TTGAGCTTGT TGGCGAGTGT GTTACCAGCC TATCTGGTGG CTTGGAGGGA 17040 

ACTTCATGAC GAAGCAGCCC AGCTTCTACT TCCTAAACCT CCTGTCAAAG GAGCTAAAAT 17100 

CTTATTGGAG CGTATCGGTT TTATCTGGCG TCGTCTCAGT TTTACTCATA AGGTAACAGC 17160 

CCGCAACATC TTTCGTTATA AGCAGAGAAT GTTGATGACA ATCTTTGGTG TGGCAGGTTC 17220 

TGTAGCTCTG CTCTTTGCAG GTTTGGGAAT CCAATCTTCT GTAGCAGGAG TTCCGTCTAA 17280 

ACAGTTTCAA CAAATCCAAC AGTATCAGAT GCTTGTCTCT GAAAATCCTA GTGCGACCAA 17340 

TCAGGACAAG GTAGAGCTAG CAGAAGTGTT GAAAGGGCAG GAGATACTAG CCTACCAGAA 17400

AATCTATTCT AAAGCGCTAT ACAAGGATTT CAAAGGCAAA GCTGGTCTTC AAAACATTAC 17460 

TCTTATGATG ATAGAGAAGG AAGATTTGAC TCCCTTTATC CATCTTCAAC ATCATCAGCA 17520 

GGAGCTGACA TTAAAAGATG GCATCGTTAT TACAGCTAAA CTCGCCCAGC TGGCAGGTGT 17580 

CAAGGTTGGG CAGACTTTAG AAATTGAAGG TAAGGAACTA AAGGTCGTTG CTATTACTGA 17640 

GAACTACGTT GGTCACTTTA TTTATATGAG TCAGGCTAGC TATGAGCAAC TTTACGGACA 17700 

GCTACCCCAA GCCAACACTT ATCTGGTCTC ATTAAGGGAT ACCAGTGCAA CTAGTATCGA 17760

AAGTCAGGCG GGCTTGCTTA TGAATCAATC TGCGGTGTCC AGCGTTGTCC AAAATGCTTC 17820 

AGCCATTCGA CTCTTCGACT CTATCGCTAG CTCACTCAAT CAGACCATGA CCATCTTGGT 17880

CATCGTATCG GTTCTATTAG CTATTGTCAT CCTTTACAAT CTGACCAATA TCAACGTAGC 17940
TGAGAGAATC


CGTGAACTCT 


CCACTATCAA GGTTCTTGGT TTTCATAATA ATGAAGTCAC 


18000 


CCTCTACATT 


TACCGTGAGA 


CGATTGTGCT GTCCCTTGTG GGAATCGTAC TTGGTCTGAT 


18060 


AGCTGGTTTC 


TATTTACACC 


AATTTTTGAT TCAAATGATT TCGCCTGCGA CTATTCTCTT 


18120 


TTATCCGCAG 


GTAGGCTGGG 


AAGTCTATGT AATCCCAGTG GCAGCAGTAA GCATCATTTT 


18180 


GACCTTGCTT 


GGTTTCTTCG 


TCAATTATTA TCTGAGAAAG GTTGATATGT TAGAAGCCCT 


18240 


GAAATCTGTA 


GAGTAAGGTA 


GTTATTTTTA GCTGATTGAA CTTCTATTTA CTAATATTCA 


18300 


AAAATCCTCC 


GTTTCAAAGA 


GCAGGGAACT CTTTGTGACA GAGGATTTTT TCTATAGGGC 


18360 


TTTAGCAGCT 


GCAATTGCGG 


CTTCGAAGTT TGGCTCAGAA TTGATATTAT CCACGTATTC


18420 


AACGTAGCGA ATCGTATTGT


CAGTATCGAG GACAAAGACT GCGCGTGCTA ATAGGTGCCA 


18480 


TTCGTTGATC 


AAGAGGGCAT 


AATCGCGCCC GAAAGAATGG TCAAAGTAGT CTGAAAGCAT 


18540


AATGGCATTG 


TCAAGGCCTT 


CAGCACCGCA CCAACGTTTT TGAGCAAAAG GTAGGTCCAT 


18600 


TGAAACAGTC 


AATACGACCG 


TGTTGTCCAG TCCAGCCAAT TCTTCATTAA AACGACGTGT 


18660 


TTGAGTTGAG


CAGATGCCTG


TATCGATAGA AGGAACGACA CTCAAGACTT TTTTCTTGCC


18720 


ATCAAAATCA 


GCCAGAGATT 


TTTTAGAAAG ATCTGTTGTA GTAAGAGAAA AATCAAGCGC


18780 


CTTGTCGCCG 


ACTTGTAGTT 


GTTTACCTGT AAAGCTCACA GGATTTCCGA GAAAAGTTAC 


18840 


CATAGGATAC 


TCCAATCTTT 


TTTCTTCCAT TTTAGCTGAA ACAGTCGGAA TTTTCCAATG 


18900 


ATTTGACCGG AAATATGGGC 


ATAGAAAAAA CGCCAGCTCA TGTGAGAATG ACGTTTTTCA 


18960 


TAGGTTTATT 


TTGCCAATCC 


TTCAGCAATC TTGTCAAGGT TGTATTTCAT CATGCTGTAG 


19020 


TAGCTGTCGC 


CTTCTTTACC 


TTGTTCTGCG ATAGAGTCAG TAAAGATTTG AGCGTAGATT


19080 


GGGATGTTTG 


TGTCTTGAGA 


AACAGTTTTC ATTGGACGGT CATCCACACT TGATTCTACA 


19140 


AAGAGTGATG 


GAACTTTTGT 


TTGGCGAAGT TTTTCAACCA AGGTCTTGAT TTGTTCAGGA


19200 


GTTCCTTCTT 


CTTCAGTATT 


GATTTCCCAG ATGTAAGCAC TTGGGACACC ATAGGCTTTA


19260 


GAGAAGTATT 


TGAATGCTCC 


TTCGCTGGTT ACAATGAGTT TCTTTTCAGC AGGGATCTTA 


19320 


TTAAATTTAT 


CCTTACTTTC 


TTTATCAAGT TTGTCTAACT TATCAGTATA TTCTTTGAGA 


19380 


TTTTTTTCAT 


AGAATTCTTT 


ATTGTTAGGG TCTTTGGCGC TCAATTGTTT GGCGATATTT 


19440 


TTAGCAAAAA 


TAATACCGTT 


TTCAAGGTTA AGCCAAGCGT GTGGGTCTTC TTTTCCTTTT 


19500 


TCATTTTGAC CTTCAAGGTA GATAACATCA ACX3CCGTCGC TGACTGCGAA GTAGTCTTTG 


19560 


TTTTCAGTTT TCTTGGCATT TTCTACCAAT TTTGTAAACC AAGCATTGCC ACCTGTTTCA 


19620 


AGGTTGATAC CGTTATAGAA AATCAAATTA GCCTCAGAAG TTTTCTTAAC GTCTTCAGGA 


19680 
AGTGGTTCGT ATTCGTGTGG GTCTTGCCCA ATCGGAACX3A TACTATGAAG GTCAATTTTG 19740

TCACCAGCAA TATTTTTAGT AATATCAGCG ATGATTGAGT TTGTAGCAAC AACTTTTAGT 19800 

TTTTGACCAG AAGTTGTATC TTTTTTTCCG CTAGCACATG CTACAAGAAT GATTGCAGAA 19860 

AGAAAGAGAA CGAGTAATGT ACCTAATTTT TTCATTAGAT CCTCCAATTT ATTAGGGCTT 19920

TGCCCCTTAT TTTAACAAAT GTTTATTTTT CAGTTTCAAA TATCGTTGTT TGGGAGCGAT 19980 

AAAGAAGCTA ATGAGAAAGA AACTAGCAGC TGTAAGCACG ATACTAGAAC CTGCCGCAAC 20040 

ATTAAAACTA TAGCCAATAA AGAGTCCCAA AACTGAAGCA GTAGCTCCGA AGGTTGAGGA 20100 

AAGGAAAATC ATACTTTTCA GACTATTAGC ATACAGATAA GCAGTTGCAG CTGGGGTAAT 20160 

CAGCATGGCT ACAATCAGGA TAGTTCCGAC ACTTTGCATG GCTGTCACAG ACACGAGAGT 20220 

CAGGAGTACC ATGAGAAGGT AGTGATAGAA ATTGACAGGC ATTCCCATGG CTTTAGCCAA 20280

GAGTTCATCA AAGGAAGTTA TCAAGAGTTG CTTGAAGAAA ATCCAGATTA ACAAGAGGAT 20340

AGCTGCCCCC ACACCCATAG TAATAAACAT ATCCGTATCT TGGACGGCCA GGATATTACC 20400 

AAAAAGGATA TGGAAAAGGT CAGTTGAACT TTTAGCGACA CCAATCAAGA TGATACCGAG 20460 

GGCTAAGAAA GAAGAAAAGG TAATGCCGAT GGCGGTATCG CTTTTGATAA TCGAGTTTCC 20520 

TTTGATGTAG GTAATGATGA TGGCAGCTAG CAATCCAAAG ACAATGGCTC CGATAAAGAA 20580 

GTCAAGGCCC AAGATGAAGG ATAGGGCTAC ACCTGGTAAG ACAGCATGTG AAATGGCATC 20640 

TCCCATGAGT GACATCCCGC GTAGAATAAT GAAACATCCC ACAGCTCCAG CTACAATCCC 20700 

GACGACAATA GCTGTTATCA AGGCATTTTG TAGGAAATGG AATTTTTGCA ATCCATCGAT 20760 

AAATTCTGCA ATCATAGGTC ACCTCCATTG AAAAAGAGTT GATTACCGTA AGCTTCTTTT 20820 

AGATTGGTTT CGGTAAAAGT TTCTTTTGTT GGACCAAAGG CAATCACTTC TCGATTGACA 20880

AGTAAGACTT GATCGAAGTA GTGGGGAATC TTGCTGAGGT CGTGGTGAAC GATGAGAACC 20940 

GTCTTCCCAG CTTTTTTCAA ATCTCTCAGC GTATTCATGA TGATTTCCTC ACTGACAGAG 21000 

TCAATCCCAG CAAAGGGTTC ATCCAAGAGG ATATAGTCGG CTTCCTGCAC CAAACATCTG 21060 

GCAATCAAGA CCCGCTGGAA TTGACCTCCA GACAGTTGAC TAATTTGACG TTCAGCGTAG 21120 

TCAGCTAGGC CGACGATTTC AAGGGCCTCT TGCACTTTCT TCCAATGTTT AGCCTTTAAA 21180 

CTTCGAAAGA GAGGAATAGA GGGAAATAGT CCTAACGAGA CGCATTCCTT GACCTTGATG 21240

GGAAAGTTGT AGTCGATATT GATTTTTTGT TCGACATAGG CAATTCGGTG TAAGGATTTT 21300 

TTAACTTCCT TGTCATCGAG AAATGCCTGA CCTTGATGTG GGATAATTCC CAACATACCT 21360 

TTTAATAGTG TTGATTTCCC AGCGCCGTTT GGACCAATGA TGCCGGTAAT TGTTGGTCCA 21420 

TGGAGCACTA GTGAAATATC CTTAAGTGCC AACGTTTCTT TGTAGGAGAC ACTGAGGTTT 21480 
TCGATACGTA TCATAAACTT GTATTCCTCC TGTCTCTTAA TATACATTAA AAAAAAAATT 21540

AAGTCAAGTT AATTTTTGAA AAAATTAAAA TAATAACTGA AAAATAGATT CTAAAGATAA 21600 

CTTTCAGGAT AAATTTCTAA ATTATAAAAC GCATAGTATC AAGTGTAAAA AACTTGGAAT 21660 

TATGCGTTTT ATCATGGAAA GATTTTTTAT AATAGCTAAA AAATAA 21706 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6171 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GATCCCCAGG AAAAACCGAG GTTTTCCCAA TCAATCGTTA CTGTCATATT CCACTCCTTA 60 

TTCTAAAAAC CTATTTCTTA TATTCTACAC TATTTTTCTA AAATAGCAAG TATATTTTGT 120 

AATTTTCAGA AAATTTCTCC AATAAAAACC AACTCTTAGA ACTGATTCTT CATTTCACTT 180 

ATTTATCTTC AGTAACTACT TCCTGAAGAT AAGCGTCAAA AACTTCTTCA TCTGAAATCG 240 

TGTCAGAAAT GAAGCTTCCA TTGCTAGTGC GTTCTGACAA GTTCAAGTCT TGCAATCGGC 300

TTTCATAGAT TGTTCCTTTA TTGGATTGGA CAAGCAGAGT TTGGTCGTTC ACATCCACTT 360 

CCGTACTGAA GAAATCGCCA ACAAATCCTT GCTCTGCAAC TGCTCCTGCC AAGAAGACAC 420 

GATGCGGTTT GTTTTTCAAC TCACGCAAGA CTTGTAATCC TCGTTTGGCA CGGCTGGTTG 480 

CTAGAATTTC CTCAATGGAA ACACGTTTCA AGCTTCCACG CTGGGTCAAG AGGTAGAAGG 540 

ACGAAGTATT ACAGATAAAG CCAGATTGGA GGACATCATC TTCTTTCAAA TTCATAGCCT 600 

TGACACCTGC TGCCTTAGCA CCGACAACCG GAACCTCTTC GATATTGAAA CGCAGGGCAT 660 

AACCATTTTG ACTAACCAAG ACAACATCAT CTAGTTTAAT CGGAGCCACT GCTACAATCT 720 

GATCTGTATC GTCTTTGAGC TTAGCATACT TGACAGACTT AGATCTATAG GTCCGCCATG 780 

GAGTGAATTC TTTTCGCTCT ACCCGTTTGA TTTGACCAAG GCGAGTCACT GCAAAGTAGG 840 

TTGTCGCATC GTCAAACTGA TCCAGTACTT CCACATAAAG GATTTCTTCA TTCGTTTCAA 900 

AGTTTGTGAT GGTTTGGCTC AGATGCTCTC CGATGTCCTT CCAACGAATA TCTGCCAACT 960 

CATGGATTGG TCTGTAGATG ACATTTCCAA GACTTGTGAA CATCAAGAGG TGCTGGGTTG 1020

TCTTGGCAGA TTGAACAAAA ATCAAACGGT CATCATCACG CTTGCCAATT TCTTCCAAGG 1080 

TGGAAGCCGC AAAGGAACGT GGACTGGTAC GCTTGATGTA ACCTGCCTTG GTCACGCTGA 1140
CGTAGGTATC TTCCTCAGCG ATAAGACTAG CTGTATCAAT CTCAATTGCT TTCGCAGTGT 1200

CTTCTAAAGA ACTCAAACGA GGAGTTGCAA ATTTCTTCTT GACCTCACGA AGTTCTTTCT 1260 

TCATGAGATT GTACATAGTC CTTTCATCAC CGATAATAGC CGCCAGCATA GCAATCTTCT 1320 

CACGAAGCTC TGCTTCTTCT TCCTGCAAGA CAACCACATC GGTATTGGTC AAACGGTACA 1380 
GTTGCAAAGT TACGATAGCC TCAGCCTGTT CTTCCGTAAA ATCATAGCTA ACTTTGAGGT 1440

TTTCCTTGGC GTCCGCCTTA TTCTCAGAAG CACGGATAAG AGCAATGACT TCATCCAAAA 1500 

TCGAAATCAC ACGAATCAAA CCTTCGACGA TATGGAGACG TTTCTCAGCC TTTTCTTTGT 1560 

CAAAGCGTGA ACGCGCCAAA ATCACTTCTC GACGGTGAGC GATATAGCTA GACAGGATTG 1620

GAACAATCCC AACCTGACGA GGTGTGAAAT TGTCAATCGC CACCATATTA AAGTTGTAGT 1680 

TGATTTGTAG GTCGGTGTAC TTAAATAAGT AGTTGAGAAC AAGCTCAGTA TTAGCX3TCTT 1740 

TCTTAAGTTC GATAGCGATA CX^GACCAT CACGGTCAGA CTCATCACGA ACCTCAGCAA 1800 

TCCCAGCTAC CTTGTTATTA ACACGAACAT CATCGATTTT CTTGACTAGA TTGGCCTTAT 1860 

TGATTTCATA AGGAATCTCA ATAATAACGA TTTGTTCCTT ACCACCTTTT AGCTTTTCAA 1920 

TTTCAGTCTT GGAACGAACA ACCACGCGCC CTTTCCCAGT CTCATAAGCT TTCTTGATTT 1980 

CATCACGACC CTGAATAATA GCCCCTGTAG GGAAGTCTGG TCCAGGCAAG AATTCCATGA 2040

GTTTATCAAT CTTTGCAGTT GGGTGGTCAA TCATGTAAAC TGCAGCATCT ATGACCTCAG 2100 

CTAAATTATG GGGAGGAATG TCTGTGGCAT AACCAGCCGA AATCCCAGTC GAACCATTGA 2160 

CCAAGAGGTT TGGAAAGGCT GCTGGCAAGA CCGTTGGTTC TTTCTCCGTA TCGTCAAAGT 2220 

TCCATGCAAA AGGAACTGTC TTTTTCTCGA TATCCTGAAG AAGGTAGCCT GCAATTTCAG 2280 

ACAAACGTGC CTCAGTATAA CGCATAGCCG CAGGAGGATC TCCGTCCATA GAACCGTTAT 2340 

TACCGTGCAT TTCAACTAGA ATCTCACGAT TTTTCCAGTT CTGTGACATA CGAACCATGG 2400 

CATCATAGAT AGAAGAATCC CCGTGTGGGT GGAAATTCCC CATGATGTTC CCGACTGACT 2460 

TGGCCGACTT ACGGTAGCTC TTGTCAAAAG TATTGCTATC CTTATTCATA GAATAAAGAA 2520 

TACXSGCGCTG AACCGGCTTC AACCCATCAC GAATATCTGG CAAAGCCCGG TCTTGAATAA 2580

TGTACTTGGA GTAGCGACCA AAGCGCTCTC CCATGATGTC CTCCAGGGAC ATGTTTTGAA 2640 

TGTTAGACAT AAGATACAAA GCCCATAAAA TACCAAGTGA AAATAGAAAA TTCTTGAAGT 2700 

AAGCAAACTC ACAAGAGAAT TTATCTTTTT CACACAGTAT CTAGGGCGTG TTCAACTCCT 2760 

TTCAAAGAAT GTAGAGTAGG TTTTTATGCA GTAAAAGATA TTTTACGGGA ATTCCTCCCG 2820 

TGTTCAGTTA CGATAAGTAA CCAAACTTATC CTGTTTGTAT TTTTCAATAT GAAAATCTGG 2880 

TTTTCCAAAA TTAGTCTTAG TTTGTGTCTT AGCCGCTCCC TTAAGCGCCT CTTTGAGATA 2940 
AGCACTCATA


GCAGATTCTT 


CATTAATAAT 


CCTGCAATTT 


TTTCAAACCA 


AGATTTTCAA 


3000 


ACTGCTTTTT 


CACATAGTCA 


TTCACATCCG 


ACTCTAATTT 


CCAGTTTACT 


AACATATTAT 


3060 


TTTCTTTCAT 


TAAAACACTG 


TCGTTTCTTC 


TAGCGTAAAC TTGACATTAT 


CTTCAATCCA 


3120 


TTTACGGCGT 


GGTTCTACCT 


TATCTCCCAT 


GAGAACATTG 


ACGCGGCGTT 


CGGCGCX3CGC 


3180 


TAAATCTTCA 


ATTGTGACAC 


GGATGAGGGT 


ACGTGTTTCT 


GGGTTCATGG 


TTGTTTCCCA 


3240 


GAGCTGGTCC 


GCATTCATCT 


CACCAAGTCC 


TTTGTATCGT 


TGGAGGGTAG 


CGCCTTTACC 


3300 


GAACTGTTTA 


CGGAGTTCTT 


CTAGTTCTCC 


GTCCGTCCAA 


GCGTAGGCCA 


CTTCTTCTTT 


3360 


CTTGCCTTTA 


CCTTTGGACA 


TCTTGTAAAG 


AGGTGGGAGG 


GCAATATAGA 


CATGACCTGC 


3420 


CTCGACTAGC 


GGACGCATGT AACGGTAGAA AAATGTCAAG AGCAAGGTCT 


GGATATX3GGC 


3480 


ACCGTCGGTA 


TCCX3CATCGG 


TCATGATAAT 


GATCTTATCA 


TAGTTGGCAT 


CTTCAATAGA 


3540 


GAAGTCTGCT 


CCAACACCCG 


CACCAATGGT 


ATAAATCATG 


GTATTGATCT 


CTTCATTTTT 


3600 


GAGGATATCC


GCCATCTTGG 


CCTTGGCTGT 


ATTGACAACC 


TTACCACGAA 


GAGGTAGAAT 


3660 


AGCCTGGAAC 


TTGCGGTCAC 


GACCTTGTTT GGCAGAACCA 


CCGGCAGAGT


CCCCCTCAAC 


3720 


TAGATAGAGT 


TCATTCTTAG 


CAGGATTCTT 


AGATTGGGCT 


GGGGTCAATT 


TCCCAGACAA 


3780 


CAAGCX:CTTA 


TCTTTCTTGT 


TTTTCTTCCC 


ATTTCGGCTC 


TCATCACGCG 


CCTTACGTGC 


3840 


TGCTTCACGA 


GCATCACGGG 


CCTTGATAGC 


CTTGCGGATG 


AGGTTAGAAG 


CTAATTCCCC 


3900 


ATTTTCCATA 


AGGAAAAAGG 


TCAACTTATC 


AGCCACTATT 


CCATCCACAA 


CTGGGCGAGC 


3960 


TAGGGGGCTT 


CCTAGTTTAT 


CCTTGGTCTG 


TCCTTCAAAC 


TGCAAGTGTT 


CTTCAGGAAC 


4020 


TAAGATAGAA AGAACGGCCG 


CTAGTCCCTC 


ACGATAGTCT 


GAACCTTCAA 


GGTTTTTATC 


4080 


TTTTTCCTTG 


AGAAGACCTG 


TTTTACGTGC 


ATAGTCATTC 


ATGACCTTGG 


TAATGGCAGA


4140 


CTTGAGTCCT 


GTCTCGTGCG


TTCCACCGTC 


CTTGGTGCGA 


ACGTTATTGA 


CAAAAGATAG 


4200 


AATGTTATCT 


GAGAATCCGT 


CATTGTACTG


GAGGGCTACT 


TCCACTTGAA 


AACCATTGTC 


4260 


TTCCCCTTCA AAGTAAAGAA 


CTGGCGTCAA 


GATTTCCTTA 


TCTTCGTTGA 


GATAAGAAAC 


4320 


AAAATCTTGT 


ACTCCATTCT 


CATAGTGGAA 


CTCAATCGCT 


TCATTTGTTC 


GCTTGTCCGT 


4380 


TAAAGACAAG 


GTCACATTTT 


TCAAGAGAAA 


GGCTGATTCA 


TTAAGGCGCT 


CTGAAATGGT 


4440 


ATTGTACTTG 


AAATCTGTCG 


TAGAAAATAT 


AGTCGCGTCA 


GGCATAAAf^G 


TAACTTTGGT 


4500 


GCCTGTTTTA GACTTGGGTG CTGTACCGAT TTTCTTCAAA GTCGTGACAG GTTTTCCACC


4560 


ATTTTCGAAA CGTTGCTTGT 


AAACTGCGCC 


ATCACGGGTA ATTTCAACTT 


CTAACCAGCT 


4620 


AGAAAGGGCG


TTAACAACGG 


AAGAACCCAC 


TCCXyrCAAGT 


CCACCTGATG 


TCTTATAGCC 


4680 
ACCTTGACCG AATTTCCCTC CGGCATGAAG AATGGTAAAG ATAACCTCAA CAGTTGGAAT 4740

TCCCATAGCX3 TGCATACcTG TCGGCATCCC ACGTCCATGG TCTTGAACCG TTAGACTACC 4800 

GTCTTTATTG ATAGTTACAT CAATACGATC ACCAAACCCA GACAAGGCTT CATCGACTTGC 4860

ATTATCAACG ATTTCCCAAA CTAGGTGATG AAGACCAGCG CCATCGGTCG ATCCAATATA 4920 

CATCCCTGGA CGTTTTCGGA CCGCATCCAA CCCTTCTAGC ACCTGAATAG CATCATCATT 4980 

ATAATTGTTA ATATTGATTT CCTTTTTTGA CACAAGGAAC CTCCTATTCG TTCATCTTTA 5040 

CTATTCTACA GGTTTTCCAA GGATTTTGCA AAATTTTTCT TTCTCCGATG TGACAATTTC 5100 

AGCAGAGATT CTCTGCTTTT CTTTCCCAAT TCATGATATA ATAGGAGTAT GATTACAATA 5160 

GTTTTATTAA TCCTAGCCTA TCTGCTGGGT TCGATTCCAT CTGGTCTCTG GATTGGACAA 5220 

GTATTCTTTC AAATCAATCT ACGCGAGCAT GGTTCTGGTA ACACTGGAAC GACCAACACC 5280 

TTCCGCATTT TAGGTAAGAA AGCTGGTATG GCAACCTTTG TGATTGACTT TTTCAAAGGA 5340 

ACCCTAGCAA CGCTGCTTCC GATTATTTTT CATCTACAAG GCGTTTCTCC TCTCATCTTT 5400 

GGACTTTTGG CTGTTATCGG CCATACCTTC CCTATCTTTG CAGGATTTAA AGGTGGTAAG 5460 

GCTGTCGCAA CCAGTGCTGG AGTGATTTTC GGATTTGCGC CTATCTTCTG TCTCTACCTT 5520 

GCGATTATCT TCTTTGGAGC TCTCTATCTT GGCAGTATGA TTTCACTGTC TAGTGTCACA 5580 

GCATCGATTG CGGCTGTTAT CGGGGTTCTG CTCTTTCCAC TTTTTGGTTT TATCCTGAGT 5640 

AACTATGACT CTCTCTTCAT CGCTATTATC TTAGCACTTG CTAGTTTGAT TATCATTCGT 5700 

CATAAGGACA ATATAGCTCG TATCAAAAAT AAAACTGAAA ATTTGGTCCC TTGGGGATTG 5760 

AACCTAACCC ATCAAGATCC TAAAAAATAA AATGCCAGTT CTGTACTGCC CCCAAACAGT 5820 

TAGACAAATA ATTTATCCAA AGGATTTAGT TCTGTACTGC ACAGGACTAA GTCCTTTTAG 5880 

TTTTACCTTA ATTCGTTTGT TGTTGTAGTA ATCAATATAG TCTATAATGG CTTGTTCCAA 5940 

TTGATTAAGT GATTTAAATG TTTTCTCATA GCCATAAAAC ATTTCX3GATT TTAAAATGCC 6000 

AAAGAAAGAT TCCATCCTAC CGTTGTCTTG GCTGTTGCCC TTACGTGACA TGGATGCTTG 6060 

AATTCCCTTA CTCTCTAGGA ACCGATGATA AGAATCGTGT TGGTATTGCC AGCCTTGGTC 6120 

ACTATGGAGA ATCGTATTCT CGTAGTGCTT CTCTGTGAAT GCCTGTTCCA A 6171 
(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18475 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



TATTACAAAT 


AAAAAAACGG 


AGGAGTGCTT 


TATGAAAGCC 


TATACTTATG 


TTAAACCAGG 


60 


ACTTGCTTCT 


TTTGTTGATG 


TAGACAAACC AGTTATTCGC 


AAGCCAACAG 


ACGCTATTGT 


120 


GCGTATTGTA AAAACCACTA 


TTTGTGGAAC 


AGACCTCCAT 


ATTATCAAAG 


GGGATGTTCC 


180 


TACTTGCCAA AGTGGTACCA TTCTTGGCCA CGAAGGGATT 


GGGATTGTTG 


AAGAAGTTGG 


240 


GGAAGGAGTT 


TCCAACTTCA 


AAAAAGGTGA 


CAAGGTCTTG 


ATTTCTTGCG 


TCTGTGCCTG 


300 


TGGTAAATGC 


TACTACTGTA 


AAAAAGGAAT 


TTATGCTCAC 


TGTGAAGACG 


AAGGGGGCTG 


360 


GATTTTCGGT 


CACTTGATTG 


ATGGTATGCA 


GGCTGAATAT 


CTACGTGTCC 


CTCATGCAGA 


420 


TAATACTCTT 


TACCATACTC 


CAGAAGACTT 


GTCAGATGAA 


GCTTTGGTTA 


TGCTGTCAGA 


480 


CATTCTGCCT 


ACTGGATATG 


AAATTGGTGT 


CTTAAAAGGG 


AAAGTAGAAC 


CTGGTTGCAG 


540 


CGTAGCCATT 


ATTGGTTCAG 


GTCCAGTTGG 


ATTGGCTGCT 


CTTTTAACAG 


CCCAATTCTA 


600 


TTCACCAGCT 


AAATTGATTA 


TGGTAGACCT 


AGACGATAAC 


CGCTTGGAAA 


CTGCCCTATC 


660 


ATTCGGTGCG 


ACTCATAAGG 


TTAATTCTTC 


AGACCCTGAA AAAGCCATTA 


AAGAAATTTA 


720 


TGATTTGACA 


GATGGTCGTG 


GTGTGGATGT 


CGCTATCGAA GCTGTTGGTA TTCCTGCAAC 


780 


ATTTGATTTC 


TGTCAAAAGA 


TTATCGGTGT 


AGACGGAACG 


GTTGCCAACT 


GTGGTGTGCA 


840 


TGGTAAACCA 


GTTGAATTCG 


ATTTAGATAA 


ACTTTGGATT 


CGCAACATCA 


ATGTAACAAC 


900 


TGGTTTGGTA 


TCTACAAATA 


CGACTCCACA 


ATTGTTGAAA 


GCACTTGAAA 


GTCATAAGAT 


960 


TGAACCGGAA AAATTGGTAA 


CTCACTATTT 


CAAACTCAGT 


GAAATTGAAA AAGCCTACGA 


1020 


AGTCTTCAGT AAGGCAGCAG ACCACCATGC CATTAAGGTC ATTATCGAAA ACGATATCTC 


1080 


AGAAGCCTAA GTAGTAAAAA TATTTTTGTA CATAAGTAAA TAGAAATTCA GTCATCCATC 


1140 


AGATGGCTGG ATTTTTTATC AAAAAATTAA GAAATGAGCA TATTTCTTTC CTTGTCTGGC 


1200 


GGAATTGGTT 


ATAATATACG 


GTACAAAGGA 


ATGAATGAAT 


ATGTATCGTG 


TTATAGAAAT 


1260 


GTACGGAGAT 


TTTGAACCGT


GGTGGTTCTT 


AGAAGGTTGG 


GAAGAAGATA 


TTGTAGCAAG 


1320 


TAGAAAATTT 


GACCAGTATT 


ATGATGCTCT 


CAAATACTAC 


AAAACTTGCT 


GGTTTAGATT 


1380 


GGAACAAGAA 


TCGCCTCTTT 


ATAAAAGTAG 


T^GCGACTTG 


ATGACCATTT 


TTTGGGACCC 


1440 


GGAAGACCAA 


CGCTGGTGTG 


ATGAATGTGA 


TGAGTATTTA CAACAATACC 


ATTCTTTGGC 


1500 


TCTTTTGCAG 


GATGAGCAGG 


TTATCCCAGA 


CGAAAAACTA 


CGCTCAGGCT 


ATGAAAAACA 


1560 


AACCAGTCAG 


GAAAGGAATC 


GTTCTTGCCG 


TATGAAATTA AAATAGAGAA 


AAGTAACTTT 


1620 


TTTGGAGTTG 


CTTTTTTTAT 


TTTTCTAACT 


CTTTGCGAAT 


AGTATAGGTG 


AGGAGGTAAG 


1680 
TATGGTTCAA GAAATTGCAC AAGAAATCAT TCGTTCAGCT CGGAAAAAAG GGACGCAGGA 1740

TATCTATTTT GTCCCTAAGT TAGACGCCTA TGAGCTTCAT ATGAGGGTAG GAGACGAGCG 1800 

CTGTAAAATT GGTAGCTATG ATTTTGAAAA GTTTGCAGCC GTTATCAGTC ACTTTAAGTT 1860 

TGTGGCGGGT ATGAATGTGG GAGAAAAAAG ACGTAGTCAA CTGGGTTCCT GTGATTATGC 1920 

CTATGACCAT AAGATAGCGT CTCTACGTTT ATCTACTGTA GGCGATTATC GGGGGCATGA 1980 

GAGTTTGGTT ATCCGTTTGT TGCACGATGA GGAGCAGGAC CTGCATTTTT GGTTTCAGGA 2040 

TATTGAAGAA TTAGGCAAGC AGTACAGGCA ACGGGGACTC TATCTTTTTG CTGGTCCGGT 2100

TGGGAGTGGT AAGACGACCT TGATGCATGA ATTGTCCAAG TCACTCTTTA AAGGACAGCA 2160 

AGTTATGTCC ATCGAAGATC CTGTCGAAAT CAAGCAGGAC GACATGCTTC AGTTGCAGTT 2220 

GAACGAAGCA ATCGGCCTAA CCTATGAAAA TCTAATCAAA CTTTCCTTGC GTCATCGACC 2280 

AGATCTCTTG ATTATCGGAG AAATTCGTGA CAGCGAGACG GCGCGTGCAG TGGTCAGAGC 2340

TAGTTTGACA GGTGCGACAG TCTTTTCAAC CATTCACGCC AAGAGTATCC GAGGTGTTTA 2400 

TGACCGTCTG CTGGAGTTGG GTGTGAGTGA AGAAGAATTG GCAGTTGTTC TGCAAGGAGT 2460 

CTGCTACCAG AGATTAATCG GGGGAGGAGG AATCGTTGAC TTTGCAAGCA GAGATTATCA 2520 

AGAACACCAA GCAGCCAAGT GGAATGAGCA AATTGACCAG CTTCTTAAAG ATGGACATAT 2580 

CACAAGTCTT CAGGCTGAGA CGGAAAAAAT TAGCTACAGC TAAGCAAAAA AATATCATCA 2640 

CCCTATTTAA CAATCTCTTT TCTAGCGGTT TTCATCTGGT GGAGACTATC TCCTTTTTAG 2700 

ATAGGAGTGC TTTGTTGGAC AAGCAGTGTG TGACCCAGAT GCGTGTGGGC TTGTCTCAGG 2760 

GGAAATCATT CTCAGAAATG ATGGAAAGTT TGGGATGTTC AAGTGCTATT GTCACTCAGT 2820

TATCCCTAGC TGAAGTTCAT GGCAATCTCC ACCTGAGTTT GGGAAAGATA GAAGAATATC 2880 

TGGACAATCT GGCTAAGGTC AAGAAAAAAT TGATTGAAGT AGCGACCTAT CCCTTGATTT 2940 

TGCTGGGTTT TCTTCTCTTA ATTATGCTGG GGCTACGGAA TTACCTGCTC CCACAACTGG 3000 

ATAGTAGCAA TATTGCCACC CAAATTATCG GTAATCTGCC CCAAATTTTT CTAGGCATGG 3060 

TAGGGCTTGT TTCCGTGCTT GCCCTTTTAG CACTCACTTT TTATAAAAGA AGTTCTAAGA 3120 

TGAGTGTCTT TTCTATCTTA GCACGCCTTC CCTTTATTGG AATCTTTGTG CAGACCTACT 3180 

TGACAGCCTA TTATGCACGT GAATGGGGGA ATATGATTTC ACAGGGAATG GAGTTGACGC 3240 

AGATTTTTCA AATGATGCAG GAACAAGGTT CCCAGCTCTT TAAAGAAGTC GGTCAAGATC 3300 

TGGCTCAAAC CCTGAAAAAT GGCCGTGAAT TTTCTCAGAC GATAGGAACC TATCCTTTCT 3360 

TTAGGAAGGA ATTGAGTCTC ATCATAGAGT ATGGGGAAGT TAAGTCCAAG CTGGGTAGTG 3420 

AGTTGGAAAT CPATGCTGAA AAAACTTGGG AAGCCTTTTT TACCCGAGTC AACCGCACCA 3480 
TGAATTTGGT GOVOCCACTG GTTTTTATCT TTGTGGCACT GATTATCGTT TTACTTTATG 3540

CGGCAATGCT CATGCCCATG TATCAAAATA TGGAGGTAAA TTTTTAAAAT GAAAAAAATG 3600 

ATGACATTCT TGAAAAAAGC TAAGGTTAAA GCTTTTACAT TGGTGGAGAT GTTGGTGGTC 3660 

TTGCTGATTA TCAGCGTGCT TTTCTTGCTC TTTGTACCTA ATCTGACCAA GCAAAAAGAA 3720

GCAGTCAATG ACAAAGGAAA AGCAGCTGTT GTTAAGGTGG TGGAAAGCCA GGCAGAACTT 3780 

TATAGCTTAG AAAAGAATGA AGATGCTAGC CTAAGAAAGT TACAAGCAGA TGGACGCATC 3840 

ACGGAAGAAC AGGCTAAAGC TTATAAAGAA TACAATGATA AAAATGGAGG AGCAAATCGT 3900 

AAAGTCAATG ATTAAGGCXTT TTACCATGCT GGAAAGTCTC TTGGTTTTGG GACTTGTGAG 3960

TATCCTTGCC TTGGGCTTGT CCGGCTCTGT CCAGTCCACT TTTTCAGCGG TAGAGGAACA 4020 

GATTTTCTTT ATGGAGTTTG AAGAACTCTA TCGGGAAACC CAAAAACX3CA GTGTAGCCAG 4080 

TCAGCAAAAG ACTAGTCTGA ACTTAGATGG GCAGACGCTT AGCAATGGCA GTCAAAAGTT 4140 

GCCAGTCCCT AAAGGAATTC AGGCCCCATC AGGCCAAAGT ATTACATTTG ACCGAGCTGG 4200 

GGGCAATTCG TCCCTGGCTA AGGTTGAATT TCAGACCAGT AAAGGAGCGA TTCGCTATCA 4260

ATTATATCTA GGAAATGGAA AAATTAAACG CATTAAGGAA ACAAAAAATT AGGGCAGTGA 4320 

TTTTACTGGA AGCAGTAGTC GCTCTAGCTA TCTTTGCCAG CATTGCGACC CTCCTTTTGG 4380 

GACAAATTCA AAAAAATAGG CAAGAGGAAG CAAAAATCTT GCAAAAGGAA GAAGTCTTGA 4440 

GGGTAGCTAA GATGGCCCTG CAGACGGGGC AAAATCAGGT AAGCATCAAC GGAGTTGAGA 4500

TTCAGGTATT TTCTAGTGAA AAAGGATTGG AGGTCTACCA TGGTTCAGAA CAGTTGTTGG 4560

CAATCAAAGA GCCATAAGGT CAAGGCTTTT ACCTTGTTAG AATCCCTGCT TGCCCTCATT 4620 

GTCATCAGTG GGGGATTACT CCTTTTTCAA GCTATGAGTC AGCTCCTCAT TTCAGAAGTT 4680 

CGCTACCAGC AACAAAGCGA GCAAAAGGAG TGGCTCTTGT TTGTGGACCA ACTTGAGGTA 4740 

GAATTAGACC GTTCGCAGTT CGAAAAAGTA GAAGGCAATC GCCTATACAT GAAGCAAGAT 4800 

GGCAAGGACA TCGCCATCGG TAAGTCAAAG TCAGATGATT TCCGTAAAAC GAATGCTCGT 4860 

GGTCGAGGTT ATCAGCCTAT GGTTTATGGA CTCAAATCTG TACGGATTAC AGAGGACAAT 4920 

CAACTGGTTC GCTTTCATTT CCAGTTCCAA AAAGGCTTAG AAAGGGAGTT CATCTATCGT 4980 

GTGGAAAAAG AAAAAAGTTA AGGCAGGTGT TCTCCTCTAC GCAGTCACCA TAGCAGCCAT 5040 

CTTTAGTCTT TTGTTGCAAT TTTATTTGAA CCGACAAGTC GCCCACTATC AAGACTATGC 5100 

TTTGAATAAA GAAAAATTGG TTGCTTTTGC TATGGCTAAA CGAACCAAAG ATAAGGTTGA 5160 

GCAAGAAAGT GGGGAACAGT TTTTTAATCT AGGTCAGGTA AGCTATCAAA ACAAGAAAAC 5220
TGGCTTAGTG ACGAGGGTTC GTACGGATAA GAGCXTAATAT GAGTTTCTGT TTCCTTCAGT 5280

CAAAATCAAA GAAGAGAAAA GAGATAAAAA GGAAGAGGTA GCGACCGATT CAAGCGAAAA 5340 

AGTGGAGAAG AAAAAATCAG AAGAGAAGCC TGAAAAGAAA GAGAATTCAT AGTCAATTCA 5400

ACTATAATGC GTTGAATCCA GAATAGTCCA CTGTAGTTTC TAGAAAATTG CTGGAAATGG 5460

ATGTTAAGCT CCAATTCATT TGTTTATATC TTATTTCAGT TTACTATACT TTGTGCTAAA 5520 

TTAAAGATAT GAAACATGAT TTTAACCACA AAGCAGAAAC TTTCGATTCC CCTAAAAATA 5580 

TCTTCCTCGC AAACTTGGTA TGTCAAGCAG CCGAGAAACA GATTGATCTT CTATCAGACA 5640 

AAGAAATTTT AGATTTCGGT GGTGGCACGG GTCTATTAGC CTTGCCCCTA ACCCCTAGCC 5700

AAGCAGGCTA AGTCAGTCAC TCTTGTAGAC ATTTCTGAGA AAATGTTGGA GCAAGCTCGT 5760 

TTGAAAGTGG AGCAGCAAGC AATCAAGAAT ATCCAGTTTT TGGAGCAAGA TTTACCGAAA 5820 

AATCCCTTGG AGAAAGAGTT TGATTGCCTT GCTGTTAGTC GGGTTCPTCA TCATATGCCT 5880 

GATTTGGATG CGGCTCTCTC ACTGTTTCAT CAACATTTGA AGGAAGATGG GAAACTCATC 5940 

ATTGCTGATT TTACCAAGAC AGAAGCTAAT CATCATGGAT TTGATTTAGC TGAACTGGAA 6000 

AACAAGCTAA TTGAGCATGG TTTTTCATCT GTGCATAGTC AGATTCTCTA TAGTGCTGAA 6060 

GACCTGTTTC AAGGAAATCA CTCAGAATTC TTTTTAATAG TAGCCCAAAA ATCACTCGCC 6120 

TAGTCAGGGA GTGATTTTTC TATAAGGATG GAAAAAAGAA GGGAAATTTG GTAAGATAGG 6180 

AATATGGATT TTGAAAAAAT TGAACAAGCT TATACCTATT TACTAGAGAA TGTCCAAGTC 6240

ATCCAAAGTG ATTTOGCGAC CAACTTTTAT GACGCCTTGG TGGAGCAAAA TAGCATCTAT 6300

CTGGATGGTG AAACTGAGCT AAACCAGGTC AAGGAGAACA ATCAAACCCT TAAGCGTTTA 6360

GCACTACGCA AAGAAGAATG GCTCAAGACC TACCAGTTTC TCTTGATGAA GGCTGGGCAA 6420

ACAGAACCCT TGCAGGCCAA TCACXy^TTT ACACCGGATG CTATTGCTTT GCTTTTGGTG 6480

TTTATTGTGG AAGAGTTGTT TAAAGAGGAG GAAATTACTA TCCTCGAAAT GGGTTCTGGG 6540 

ATGGGAATTC TAGGCGCTAT TTTCTTGACC TCGCTTACTA AAAAGGTGGA TTACTTGGGA 6600 

ATGGAAGTGG ATGATTTGCT GATTGATCTG GCAGCTAGCA TGGCAGATGT AATTGGTTTG 6660 

CAGGCTGGCT TTGTCCAAGG AGATGCCGTT CGCCCACAAA TGCTCAAAGA AAGCGATGTG 6720 

GTCATCAGTG ACTTGCCTGT CGGCTATTAT CCTGATGATG CCGTTGCGTC GCGCCATCAA 6780

GTTGCTTCTA GCCAAGAACA TACTTACGCC CATCACTTGC TCATGGAACA AGGGCTTAAG 6840 

TACCTCAAGT CAGACGGATA CGCTATTTTT CTAGCTCCGA GTGATTTGTT GACCAGTCCT 6900 

CAAAGTGATT TGTTAAAAGA ATGGCTGAAA GAAGAGGCGA GTCTGGTTGC TATGATTAGT 6960

CTGCCTGAAA ATCTCTTTGC TAATGCCAAA CAATCTAAGA CTATTTTTAT CTTACAGAAG 7020 
AAAAATGAAA TAGCAGTAGA GCCTTTTGTT TATCCACTTG CTAGCTTGCA AGATGCAAGT 7080

GTTTTAATGA AATTTAAAGA AAATTTTCAA AAATGGACTC AAGGTACTGA AATATAAAAT 7140 

AGATTTTGTT ATAATAGTTG AAAACGCTTA AAAAGGGGTA TCATGTTATG ACAAAAACAA 7200 

TTGCAATCAA TGCAGGAAGT TCAAGTTTGA AATGGCAATT ATACTTAATG CCAGAAGAAA 7260 

AAGTATTGGC GAAAGGTTTG ATTGAACGTA TCGGTTTGAA AGATTCAATT TCAACTGTAA 7320

AATTTGACGG CCGTTCTGAA CAACAAATTT TGGATATTGA AAATCATATA CAAGCCGTTA 7380

AAATTTTATT GGATGACTTG ATTCGTTTCG ATATTATCAA GGCTTATGAC GAGATTACAG 7440 

GTGTTGGACA TCX5TGTTGTT GCTGGTGGAG AATATTTCAA AGAATCAACA GTTGTTGAGG 7500 

GAGATGTTTT AGAAAAAGTT GAAGAGTTGA GTTTGTTGGC TCCTCTACAC AACCCGGCCA 7560 

ATGCAGCAGG TGTTCGTGCC TTCAAGGAAT TGTTGCCAGA CATTACCAGT GTAGTTGTTT 7620

TTGATACTTC CTTCCACACA AGTATGCCAG AGAAAGCTTA TCGCTACCCT CTACCAACAA 7680 

AATATTACAC AGAAAACAAG GTTCGTAAAT ACGGTGCTCA TGGTACAAGT CACCAGTTTG 7740 

TAGCAGGAGA AGCTGCAAAA CTCTTGGGAC GTCCATTAGA AGACTTGAAG TTAATTACCT 7800

GTCATATTGG TAACGGAGGC TCy^ATTACAG CTGTGAAAGC CGGCAAATCT GTAGACACTT 7860 

CTATGGGGTT CACTCCTCTT GGTGGTATTA TGATGGGAAC GCGTACAGGG GATATTGATC 7920 

CAGCTATCAT TCCTTATTTA ATGCAATATA CAGAGGATTT TAACACACCA GAAGATATCA 7980 

GTCGTGTTCT TAACCGTGAA TCAGGTCTTT TGGGAGTTTC TGCTAATTCT AGCGATATGC 8040 

GCGATATAGA AGCAGCTGTA GCAGAAGGGA ATCACGAGGC TAGCTTGGCT TATGAAATGT 8100

ATGTTGACCG TATCCAAAAA CATATCGGTC AGTACCTTGC AGTGCTAAAT GGAGCAGATG 8160 

CCATTGTTTT CACAGCAGGT GTCGGTGAAA ATGCAGAGAG TTTCCGTCGT GATGTAATCT 8220

CAGGGATTTC GTGGTTTGGT TGTGATGTTG ATGATGAAAA GAATGTCTTT GGCGTTACAG 8280

GAGACATCTC AACAGAGGCA GCTAAAATCC GTGTCTTGGT TATTCCAACA GATGAAGAAT 8340 

TAGTCATTGC CCGTGACGTT GAACGCTTGA AAAAATAAGT GAAACTAAAA AAATATTCAA 8400 

TACAAGGAGT TGGGAAAGTT ATTTTTCCAG CTTCTTTTTC TGATGAAATT GTCCAAAACC 8460 

TTGCTATGAT TGGCTTTTTT GAAAAATATG GTATAATAGT AGTAATTTAA TAGATGGAGT 8520 

TGAGTTTTGA AGAAAAACTT TCGTGTAAAA AGAGAGAAAG ATTTTAAGGC GATTTTCAAG 8580 

GAGGGGACAA GTTTTGCTAA TCGCAAATTT GTGGTCTACC AATTAGAAAA CCAGAAAAAC 8640 

CGTTTTCGAG TAGGTCTATC AGTTAGCAAA AAACTGGGGA ATGCCGTCAC TAGAAATCAA 8700

ATTAAGCGAC GGATTCGGCA TATTATCCAG AATGCAAAAG GGAGTCTGGT AGAAGATGTC 8760
GACTTTGTTG TCATTGCTCG AAAAGGAGTC GAAACCTTGG GATACGCAGA GATGGAGAAA 8820

AATCTACTCC ATGTATTAAA ATTATCAAAG ATTTACCGGG AAGGAAATGG GAGTGAAAAA 8880 

GAAACTAAAG TTGACTAGTT TGCTAGGACT GTCTCTGTTA ATCATGACAG CCTGTGCGAC 8940 

TAATGGGGTA ACTAGCGATA TTACAGCCGA ATCGGCTGAT TTTTGGAGTA AATTGGTTTA 9000 

CTTCTTTGCG GAAATCATTC GCTTTTTATC GTTTGATATT AGTATCGGAG TGGGGATTAT 9060 

TCTCTTTACG GTCTTGATTC GTACAGTCCT CTTGCCAGTC TTTCAGGTGC AAATGGTGGC 9120 

TTCTAGGAAA ATGCAGGAAG CTCAGCCACG CATTAAGGCG CTTCGAGAAC AATATCCAGG 9180

TCGAGATATG GAAAGCAGAA CCAAACTAGA GCAGGAAATG CGTAAAGTAT TTAAAGAAAT 9240 

GGGTGTCAGA CAGTCAGACT CTCTTTGGCC GATTTTGATT CAGATGCCGG TTATTTTGGC 9300 

CCTGTTCCAA GCCCTATCAA GAGTTGACTT TTTAAAGACA GGTCATTTCT TATGGATTAA 9360 

CCTTGGTAGT GTGGATACAA CCCTTGTTCT TCCGATTTTA GCAGCAGTAT TCACCTTTTT 9420 

AAGTACTTGG TTGTCCAACA AAGCTTTGTC TGAGCGAAAT GGCGCTACGA CTGCGATGAT 9480

GTATGGGATT CCAGTCTTGA TTTTTATCTT TGCAGTTTAT GCGCCAGGTG GAGTCGCCCT 9540 

ATACTGGACA GTGTCTAATG CTTATCAAGT CTTGCAAACC TATTTCTTGA ATAATCCATT 9600 

CAAGATTATC GCAGAGCGCG AGGCCGTAGT ACAGGCACAA AAAGATTTGG AAAATAGAAA 9660 

AAGAAAAGCC AAGAAAAAGG CTCAGAAAAC GAAATAAATA AGGAGGAATC TGGTAGTGGT 9720 

AGTATTTACA GGTTCAACTG TTGAAGAAGC AATCCAGAAA GGATTGAAAG AATTAGATAT 9780

TCCAAGAATG AAGGCTCATA TCAAAGTCAT TTCTAGGGAG AAAAAAGCCT TTCTTGGTCT 9840 

ATTTGGTAAA AAACCAGCCC AAGTGGATAT TGAAGCGATT AGTGAAACGA CTGTTGTCAA 9900

AGCAAATCAA CAGGTAGTAA AAGGCGTTCC GAAAAAAATC AATGATTTGA ACGAGCCTGT 9960 

GAAGACGGTT AGTGAAGAAA CCGTTGACCT TGGTCATGTG GTTGATGCTA TTAAAAAAAT 10020 

AGAGGAAGAA GGTCAAGGTA TTTCTGATGA AGTCAAGGCT GAAATCTTAA AACATGAAAG 10080 

ACATGCCAGC ACTATCTTAG AAGAAACTGG TCACATTGAG ATTTTAAATG AACTTCAAAT 10140 

CGAGGAAGCG ATGAGGGAAG AAGCAGGCGC TGATGACCTT GAAACTGAGC AAGACCAAGC 10200 

TGAAAGTCAA GAACTAGAAG ACPTGGGCTT GAAAGTTGAA ACGAACTTTG ATATTGAACA 10260 

AGTAGCTACG GAAGTAATGG CTTATGTTCA AACGATTATT GATGACATGG ATGTTGAGGC 10320

TACACTTTCA AATGATTATA ACCGTCGTAG CATCAATCTA CAAATTGACA CCAACGAACC 10380 

AGGTCGTATT ATCGGCTACC ATGGTAAAGT CTTGAAGGCC TTGCAACTGT TGGCTCAAAA 10440 

TTATCTTTAC AACCGCTATT CCAGAACCTT CTACGTTACA ATCAATGTCA ATGATTATGT 10500 

CGAACACCGT GCAGAAGTCT TGCAGACCTA TGCGCAAAAA TTGGCGACTC GTGTTTTGGA 10560
AGAAGGGCGC


AGTCATAAAA 


CAGATCCAAT 


GTCAAATAGC 


GAACGCAAGA 


TTATCCATCG 


10620 


TATTATTTCA CGTATGGATG 


GCGTGACTAG 


TTACTCTGAA 


GGTGATGAGC 


CAAATCGCTA 


10680 


TGTTGTTGTA GATACAGAAT 


AAGTAAAATC 


AGGTTTATCC 


TGATTTTTTG 


CTAGTTAGAG 


10740 


GAGGTTAAAC 


TGATGTTGAA 


TAAGATAAGA 


GACTATTTAG 


ACTTTGCTGG 


TTTGCAGTAC 


10800 


CGTAATCCTG 


ATAAAGCGGG 


AGCAGAGCGA 


GAGAAGATGC 


TGGCATTCCG 


CCACAAAGGA 


10860 


CAAGAGGCCC 


GAAAGGTTTT 


TACAGAACTG 


GCCAAAGCCT 


TTCAAGCAAG 


CCATCCAGAA 


10920 


TGGCAACTCC 


AACAGACTAG 


CCAGTGGATG 


AATCAGGCCC 


AGCGTTTGAG 


ACCAGATTTT 


10980 


TGGGTTTATC 


TACAGAGAGA 


CGGACAAGTG 


ACAGAACCTA 


TGATGGCCTT 


ACGTTTGTAT 


11040 


GGGACATCTA CTGACTTTGG 


AATTTCTTTG 


GAAGTCAGTT 


TCATCGAACG 


TAAGAAGGAT 


11100 


GAGCAAACAC 


TGGGCAAGCA 


GGCCAAAGTT 


TTAGACATTC 


CAACCGTTAA AGGGATTTAT 


11160 


TATCTAACCT 


ACTCTAATGG 


TCAAAGTCAA 


CGGTGGGAGG 


CGAATGTU^GA AAAGCGTCGT 


11220 


ACTTTACGCG 


AGAAGGTGAG 


AAGTCAAGAA 


GTTCGAAAAG 


TTTTAGTGAA 


GGTAGATGTT 


11280 


CCTATGACAG 


AAAATTCGTC 


TGAAGAA6AA 


ATCGTAGAAG 


GCTTATTGAA 


GTCTTATTCT 


11340 


AAAATTCTTC 


CCTATTATCT 


AGCTACGAGA 


AAATAAGATA 


ATTTGTAAAA 


CATCATAAAT 


11400 


CATACAGTCC 


AAGAGTGAAC 


AGTCCGCTGT 


GTAATTCTTG 


GTCTTTTTGT 


TTGCGCTTTC 


11460 


GCATTATATA 


ATAAACTTAC 


AAAAACAATT 


CAAAAGGAGA 


ACAATTATGG 


AAGTCGTTTC 


11520 


AAGTGTTCTA 


AATTGGTTTT 


CTAGCAATAT 


TTTGCAGAAT 


CCCGCATTTT 


TCGTAGGTTT 


11580 


ATTGGTGTTG 


ATAGGATATG 


CACTTTTGAA 


AAAACXTGCC 


CATGACGTTT 


TTTCAGGGTT 


11640 


TGTTAAAGCA ACAGTAGG6T 


ATATGTTGCT 


TAACGTGGGT 


GCTGGTGGTT 


TGGTTACAAC 


11700 


CTTTCGTCCA 


ATCTTAGCAG 


CTCTTAACTA 


CAAATTCCAA 


ATTGGTGCAG 


CGGTTATCGA 


11760 


CCCTTACTTT G6ACTTGCTG 


CAGCAAACAA 


CAAAATTGTA 


GCAGAGTTTC 


CAGATTTTGT 




TG6AACTGCA 


ACTACAGCTC 


TATTGATTGG 


TTTTGGAATA AATATCTTGC TCGTAGCTCT 


11880 


TCGAAAGATT 


ACGAAGGTAA 


GAACCCTCTT 


TATTACTGGT 


CACATCATGG 


TACAACAAGC 


11940 


TGCAACAGTA TCTCTTATGG 


TTCTATTCTT 


AGTACCACAA 


TTGCGCAATG 


CTTACGGTAC 


12000 


AGCAGCGATT 


GGTATCATCT 


GTGGACTTTA 


CTGGGCAGTT 


AGTTCAAATA 


TGACTGTTGA 


12060 


GGCAACTCAA CGCTTGACTG 


GTGGTGGCGG 


ATTTGCGATT 


GGTCACCAAC 


AGCAATTTGC 


12120 


AATCTGGTTT GTAGATAAA6 


TAGCAGGACG 


CTTT6GTAAG 


AAAGAAGAAA GTTTAGACAA 


12180 


TCTTAAATTA CCTAAGTTCX: TCTCAATCTT CCACGATACA GTTGTTGCAT CTGCTACCTT 


12240 


GATGCTCGTA 


TTCTTCGGAG 


CCATTCTTTT 


AATCTTGGGT 


CCAGACATTA TGTCTAATAA 


12300 
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AGAAGTCATC ACTTCAGGAA CTCTATTCAA TCCTGCTAAA CAAGATTTCT TTATGTACAT- 12360 

TATCCAAACA GCCTTTACCT TCTCAGTTTA CTTGTTCGTT TTGATGCAAG GTGTCCGAAT 12420 

GTTCGTATCT GAGTTGACAA ACGCCTTCCA AGGTATTTCA AACAAATTGT TGCCAGGTTC 12480 

ATTCCCAGCG GTTGACGTTG CAGCTTCTTA TGGATTTGGT TCTCCAAATG CTGTCTTGTC 12540 

AGGATTTACC TTTGGTTTGA TTGGTCAATT GATTACAATT GTTTTGCTCA TCGTCTTTAA 12600 

AAATCCGATT CTTATTATTA CAGGATTTGT ACCAGTGTTC TTTGACAATG CAGCCATTGC 12660 

GGTCTACGCT GATAAACGCG GCGGATGGAA AGCGGCTGTT ATCCTTTCCT TTATATCAGG 12720 

TGTCCTTCAA GTTGCTCTAG GAGCTCTTTG TGTGGCCCTT CTCGATTTGG CATCTTATGG 12780 

TGGCTACCAT GGAAATATCG ACTTTGAATT CCCATGGCTT GGATTTGGAT ATATCTTCAA 12840 

ATACCTTGGT ATTGTTGGTT ATGTACTTGT OTGTCTCTTC TTGCTTGTTA TTCCTCAACT 12900 

TCAATTTGCC AAAGCAAAAG ATAAAGAGAA ATATTACAAC GGT6AAGTTC AAGAAGAAGC 12960 

TTAGTATCTA GAAAAGGAGA AATAAAATGG TTAAAGTATT AGCAGCGTGC GGAAATGGAA 13020 

TGGGTTCATC AATGGTTATC AAGATGAAGG TTGAAAATGC TCTCCGTAAG CTTAATCAAA 13080 

CAGATTTTAC AGTCAATTCA TGCAGTGTCG GTGAAGCTAA AGGTTTAGCA GTAGGATATG 13140 

ACATCGTAAT CGCTTCTCTT CATTTGATTC AAGAATTGGA AGGGCGAACT AATGGGAAGT 13200 

TAATTGGGCT TGATAACTTG ATGGATGATA AAGAAATCAC CGAAAAACTC AGTCAAGCAC 13260 

TACAGTAAAA GGTTGGAGGG GGCTGGACAG AAACTGAGAG TTATCGTTTC TGTCCTTCTC 13320 

CCTCTTTAAA TAAAGGAGGC AGATAT6AAT TTAAAACAAG CTTTAATTGA CAATGACTCG 13380 

ATCCGACTAG GTTTAGAGGC TAACAATTGG AAAGAAGCAG TCAAGGTAGC AGTAGATCCC 13440 

TTAATTGAAA GTGGGGCAAT TTTGCCAGAG TATTACGATG CTATCATTGA ATCGACTGAA 13500 

GAGTATGGGC CTTACTATAT CTTGATGCCA GGTATGGCTA TGCCCCACGC TAGACCTGAA 13560 

GCAGGTGTGC AAAGTGATGC CTTTTCATTG ATTACCTTAC AAAATCCTGT TGTATTTTCA 13620 

GATGGGAAAG AGGTATCTGT TTTGTTGGCA CTAGCAGCAA CAAGTTCAAA AATTCACACA 13680 

AGTGTAGCCA TTCCACAAAT TATTGCCCTA TTTGAATTAG AAGATTCTAT TGCACGTTTA 13740 

CAGGCTTGCC AGACTAAAGA AGATGTCTTG GCTATGATTG AAGAATCT7UV GGATAGCCCT 13800 

TATCTCGAAG GATTGGATTT GGAAAGTTAG AAAGAGGAAT AAAGAAATGA CAAAAAGAAT 13860 

ACCTAATTTA CAAGTTGCAT TAGACCATTC AGACTTGCAA GGAGCX3ATTA AAGCAGCTGT 13920 

TTCTGTTGGT CAGGAAGTAG ATATTATCGA AGCTGGAACT GTTTGCTTGC TTCAAGTTGG 13980 

AAGTGAACTG GCTGAAGTCT TGCX3TA6CCT TTTCCCAGAT AAGATTATTG TGGCAGACAC 14040 

AAAATGTGCT GATGCTGGTG GAACAGTTGC TAAAAATAAT GCGGTTCGTG GAGCAGACTG 14100 



GATGACTTGT ATCTGTTGTG CAACCATCCC TACTATGGAA GCAGCTCTAA AGGCTATCAA 14160

GACTGAACGA GGAGAACGAG GCGAAATCCA GATCGAGCTT TATGGCGATT GGACTTTTGA 14220

ACAAGCTCAG CTTTGGCTAG ATGCAGGTAT CTCACAAGCT ATTTATCACC AATCTCGTGA 14280

TGCTCTTCTT GCTGGTGAAA CTTGGGGTGA AAAAGACCTT AATAAGGTTA AAAAACTCAT 14340

TGACATGGGC TTCCGTGTAT CTGTAACAGG TGGTCTAGAT GTAGATACTC TCAAACTCTT 14400

TGAAGGTATT GATGTCTTTA CCTTTATCGC AGGTCGTGGA ATTACAGAGG CTGTGGATCC 14460

AGCAGGAGCA GCGCGTGCCT TCAAGGATGA AATCAAACGA ATTTGGGGGT AAATCATGGT 14520

ACGTCCAATT GGAATTTATG AAAAGGCAAC CCCAACACAC TGTACTTGGC TAGAACGTTT 14580

AAATTTTGCC AAGGAGTTAG GCTTTGATTT TGTCGAGATG TCTATTGACG AACGTGACGA 14640

GCGTTTAGCA AGACTTGACT GGAGTAAGGA AGAACGCTTG GAAGTTGTCA AAGCAATCTA 14700

T6AAACTGGT GTTCGTATTC CTTCTATCTG TTTTTCAGGC CATCGTCGCT ACCCATTGGG 14760

TTCAAAAGAT CCAGTTCTA6 AGGAAAAATC TCTAGAACTC ATGAAAAAAT GTATCGAATT 14820

AGCTCAAGAC TTGGGAGTTC GTACGATTCA ATTAGCTGGT TACGATGTTT ACTATGAGGA 14880

AAAGTCACCC CAGACACGCC AACGTTTTAT CAAAAATTTG AGAAAAGCCT GTGACTGGGC 14940

TGAAGAAGCT CAGGTGGTAC TTGCTATTGA AATTATGGAT GATCCTTTCA TCAGTAGCAT 15000

CGAAAAATAT TTGGCTATAG AAAAAGAGAT TGACTCTCCC TTCCTCTTTG TATATCCAGA 15060

TATTGGTAAT GTGTCTGCAT GGCATAATGA TATCTATAGT GAGTTTTATC TTGGTCATCA 15120

TGCCATCGCA GCTCTCCATC TCAAGGATAC TTATGCAGTG ACAGAAAGTT CAAAGGGCCA 15180

GTTCCGAGAT GTACCTTTCG GGCAAGGTTG TGTCAAATGG GAAGAAGCTT TCGATATTTT 15240

AAAGGAAACC AATTATAATG GACCTTTCCT AATCGAAATG TGGTCTGAAA ATTGTGAAAC 15300

AGTAGAAGAA ACACGCGCAG CCATTCAAGA GGCGCAAGCT TTTCTCTATC CACTCATTAA 15360

GAAAGCAGGT TTGATGTAAG ATGAATCAAG TAATCAATGC TATGCGTAAA CGAGTCTGTG 15420

ATGCCAATCA ATCATTGCCA AAACATGGAC TTGTCAAATT TACCTGGGGG AATGTATCTG 15480

AAGTTAATCG CGAACTCGGT GTCATTGTTA TCAAACCATC AGGCGTGGAT TATGACGAAT 15540

TGACACCTGA AAACATGGTA GTGACTGATC TAGATGGTAA GATCCTAGAA GGGGATTTAA 15600

GACCATCTTC CGACCTCCCA ACTCATGTGC AATTATATAA GACTTGGTCA GAAATTGGTA 15660

6TGTGGTTCA CACCCATTC6 ACAGAAGCTG TTGGTTGGGC TCAGGCAGGT CGTGATATTC 15720

CTTTCTACGG AACAACCCAT GCAGATTATT TCTACGGTTC AATCCCTTGC GCCCGTAGTT 15780

TGACCAAGGA CGAAGTAGAA GTGGCCTATG AAAAAGATAC TGGCCTGGTT ATCGTAGAAG 15840

AGTTTGAACA TCGCGGACTT AACCCGGTTG AAGTACCAGG AATTGTTGTA CGCAATCACG 15900

GTCCATTCAC CTGGGGCAAA AATCCAGAGA ATGCTGTTTA TCACTCTGTC GTACTAGAGG 15960

AAGTATCAAA GATGAATCGC TTTACAGAAC AAATCAATCC AAGAGTTGGA CCTGCTCCCC 16020

AGTACATACT AGAAAAACAC TACCAACGTA AACATGGACC AAATGCTTAT TATGGTCAAA 16080

AGTAAGAACG ATGAAGGAGG AGAAAAAGAT AAATTTAGCT CCTCTTTTTA CATTTGATTT 16140

TTATTGAGAG TAAAGTTGGA GTTGAAGTAA TTTTAAAAGA TTTTTTAGAA ATAGCGCTTG 16200

ATATATATAT GGTAAAATAA AAAGAATTGC TGTGATATCA ATAGATTTGG GGGATTTTTT 16260

AATATGGTAC TGGATAAGGC AAGTTGTGAT TTGCTTCAAT ATTTGATGGA TCAAGAAACG 16320

TCCAAAACGA TTATGGCGAT TTCGAAAGAT TTGAAAGAGT CAAGAAGGAA AATTTATTAT 16380

CACATTGACA AAATCAATGC TGCTCTGGGT GACGAGGCGC TTCACATCAT TAGTATTCCA 16440

CGAATTGGTA TTCACTTAAC G6AAGAGCAG AGAGATGCTT GTTGTAAACT ATTATCQGAA 16500

GTAGATTCGT ACGATTATAT CATGAGTGCG CATGAACGTA TGATGATAAT GTTACTATGG 16560

ATAGGTATTT CTAAAGAACG TATTACGATT GAAAAATTGA TAGAGTTAAC AGAGGTATCT 16620

AGGAATACTG TTCTCAATGA TTTGAATAGT ATTCGTTATC AACTAACTTT GGAACAATAT 16680

CAGGTGATCT TGCAAGTGAG CAAGTCACAG GGATACAACC TTCATGCCCA CCCTCTTAAT 16740

AAAATTCAGT ATCTTCAATC GCTTCTATAT CATATTTTTA TGGAAGAAAA TGCCACTTTT 16800

GTATCTATTT TAGAAGATAA GATGAAAGAG AGGTTAGATG ATGAGTGTTT GCTTTCTGTT 16860

GAAATGAACC AATTTTTTAA GGAACAGGTT CCTTTAGTTG AACAAGATTT AGGGAAGAAA 16920

ATAAACCATC ATGAAATAAC TTTTATGTTG CAGGTTCTAC CTTATTTGCT GTTAAGCTGT 16980

CATAATGTTG AACAGTATCA AGAAAGACAT CAGGATATAG AGAAAGAATT TTCTTTGATA 17040

AGAAAAAGAA TAGAGTATCA GGTGTCTAAG AAATTAGGAG AACGGTTGTT TCAAAAGTTT 17100

GAAATTTCTT TGTCAGGACT TGAAGTTTCT CTTGTAGCTG TTCTCCTCCT CTCCTATCGT 17160

AAAGATTTGG ATATTCATGC AGAAAGTGAT GATTTTCGGC AATTAAAACT TGCTTTAG^'iA 17220

GAATTTATCT GGTATTTTGA ATCACAAATC CGAATGGAGA TTGAGAACAA GGATGATTTG 17280

TTACGAAATT TGATGATCCA CTGTAAAGCC TTGTTATTTA GAAAGACTTA CGGTATTTTT 17340

TCTAAAAATC CTCTAACAAA ACAAATTCGA TCCAAGTATG GAGAATTATT TTTAGTCACT 17400

AGAAAATCT6 CGGAAATTTT AGAAGGAGCA TG6TTTATTC GGCTAACAGA CGATGATATT 17460

GCCTATTTGA CGATTCATAT TGGAGGATTT TTAAAATATA C3VCCATCATC TCAAAAAAAT 17520

ATGAAAAAAG TTTATCTCGT TTGTGATGAA GGTGTTGCGG TTTCGAGACT TTTGCTGAAA 17580

CAATGCAAAC TTTATTTTCC AAATGAGCAA ATTGACACTG TATTTACAAC AGAACAATTT 17640

AAGAGTGTGG AAGATATTGC ACAAGTTGAT GTAGT6ATTA CTACTAATGA TGATTTGGAT 17700

AGCAGATTTC CGATTTTAAG GGTTAATCCT ATCCTTGAAG CAGAAGATAT TTTGAAAATG 17760

CTAGACTATC TTAAACACAA TATATTTCGT AATAAGAGCA AAAGTTTCAG TGAAAATCTT 17820

TCTAGTCTTA TTTCGTCTTA TATTGTAGAC AGCAAGTTGG CTAGTAAGTT CCAAGAAGAG 17880

GTTCAAACAC TTATTIAATCA AGAAATAGTA GTTCAAGCTT TTTTGGAAGr TATTTGAAGG 17940

ACAGTCCAAT GATGAACACA AACCTGTGTk TTTCsTGGTC TTTTtTAGTG TTTTGAAGGG 18000

nl^lUnnAuA lAACAAi lAl AivwAAAuuA uuUA/iL.AiAl 18060

AAAGAAATTA CAAGAGA6TC ATGGATTTTA GCCACTTTCC CAGAGTGQGG AACATGGTTG 18120

AACGAAGAAA TCGAAGAAGA AGTCGTACCT GAAGGCAACT TTGCCATGTG GTGGCTAGGC 18180

AACTGTGGTA CTTGGATTAA GACACCAGCT GGT6CTAACG TTGTCATGGA CCTTTGGTCA 18240

AACCX?rGGAA AATCAACCAA AAAAGTGAAA GATATGGTTC GTGGGCACCA AATGGCAAAT 18300

ATGGCAGGTG TTCGTAAGCT GCAACCAAAC TTGCGTGTTC AGCCAATGGT TATCGATCCA 18360

TTTGCTATCA ACGAACTAGA CTATTACTTA GTTTCACACT TCCACAGTGA TCATATCGAC 18420

CCATACACAG CTGCAGCAAT TCTCAATAAT CCTAAGTTAG AGCATGTTAA GTTGG 18475



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7186 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCAGGATTTG GTACCGTTGC AAGTGGTGTG CCTTTCCTCC TAAAGGAAAA TGGAGGAAAA 60 

ATCAATCAAT CAGCACATTC AGATATCAAA GTTGCTAAGG TATTGGTCAA GGATGAAGAT 120 

GAAAAAAATC GCTTGCTTGC AGCAGGGAAT GACTTTAACT TTGTAACCAA TGTGGATGAT 180 

ATTTTATCAG ACCAGGATAT TACTATCGTA GTGGAATTGA TGGGGCGTAT TGAGCCTGCT 240 

AAAACCTTTA TCACTCGTGC CTTGGAAGCT GGAAAACACG TTGTTACTGC TAACAAGGAC 300 

CTTTTAGCTG TCCATGGCGC AGAATTGCTA GAAATCGCTC AAGCTAACAA GGTAGCACTT 360 

TACTACGAAG CAGCAGTTGC TGGTGGGATT CCAATTCTTC GTACTTTAGC AAATTCCTTG 420

GCTTCTGATA AAATTACGCG CGTGCTTGGA GTAGTCAACG GAACTTCCAA CTTCATGGTG 480

ACCAAGATGG TGGAAGAAGQ CTGGTCTTAC GATGATGCTC TTGCGGAAGC ACAACGTCTA 540

GGATTTGCAG AAAGCGATCC GACGAATGAC GTAGATGGGA TTGATGCAGC CTACAAGATG 600

GTTATTTTGA GCCAATTTGC CTTTGGCATG AAGATTGCCT TTGATGATGT AGCCCACAAG 660 

GGAATCCGCA ATATCACACC AGAAGACGTA GCTGTAGCTC AAGAGCTTGG TTACGTAGTG 720 

AAATTGGTTG GTTCTATTGA GGAAACTTCT TCAGGTATTG CTGCAGAAGT GACTCCAACC 780 

TTCCTACCTA AAGCGCACCC ACTTGCTAGT GTGAATGGCG TAATGAACGC TGTCTTTGTA 840 

GAATCTATCG GTATTGGTGA GTCTATGTAC TACGGACCAG GTGCGGGTCA AAAACCAACT 900

GCAACAAGTG TTGTAGCTGA TATTGTCCGT ATCGTTCGTC GTTTGAATGA TGGTACTATT 960 

GGCAAAGACT TCAACGAATA TAGCC6TGAC TTGGTCTTGG CAAATCCTGA AGATGTCAAA 1020 

GCAAACTACT ATTTCTCAAT CTTGGCTCTA GACTCAAAAG 6TCAGGTCTT GAAGTTGGCT 1080 

GAAATCTTCA ATGCTCAAGA TATTTCCTTT AAGCAAATCC TTCAAGATGG CAAAGAGGGT 1140 

GACAAGGCGC GTGTCGTTAT CATCACACAC AAGATTAATA AAGCCCAGCT TGAAAATGTC 1200 

TCAGCTGAAT TGAAGAAGGT TTCAGAATTC GACCTCTTGA ATACCTTCAA GGTGCTAGGA 1260 

GAATAAGATG AAGATTATTG TACCTGCAAC CAGTGCCAAT ATCGGGCCAG GTTTTGACTC 1320 

GGTCGGTGTA GCTGTAACCA AGTATCTTCA AATTGAGGTC TGCGAAGAAC GAGATGAGTG 1380 

GCTGATTGAA CACCAGATTG GCAAATGGAT TCCACATGAC GAGC6TAATC TCTTGCTCAA 1440 

AATCGCTTTG CAAATTGTAC CAGACTTGCA ACCAAGACGC TTGAAAATGA CCAGTGATGT 1500 

CCCTTTGGCG CGCGGTTTGG GTTCTTCCAG CTCGGTTATC GTTGCTGGGA TTGAACTAGC 1560 

CAACCAACTG GGTCAACTCA ACTTATCAGA CCATGAAAAA TTGCAGTTAG CGACCAAGAT 1620 

TGAAGGGCAT CCTGACAATG TGGCTCCAGC CATTTATGGT AATCTCGTTA TTGCAAGTTC 1680 

TGTTGAAGGG CAAGTCTCTG CTATCGTAGC AGACTTTCCA GAGTGTGATT TTCTAGCTTA 1740 

CATTCCAAAC TATGAATTAC GTACTCGCGA CAGCC6TAGT GTCTTGCCTA AAAAATTGTC 1800 

TTATAAGGAA GCTGTTGCTG CAAGTTCTAT CGCCAATGTA GCGGTTGCTG CCTTGTTGGC 1860 

AGGAGACATG GTGACCGCTG GGCAAGCAAT CGAGGGAGAC CTCTTCCATG AGCGCTATCG 1920 

TCAGGACTTG GTAAGAGAAT TTGCGATGAT TAAGCAAGTG ACCAAAGAAA ATGGGGCCTA 1980 

TGCAACCTAC CTTTCTGGTG CTGG6CCGAC AGTTATGGTT CTGGCTTCTC ATGACAAGAT 2040 

GCCAACAATT AAGGCAGAAT TGGAAAAGCA ACCTTTCAAA GGAAAACTGC ATGACTT6AG 2100 

AGTTGATACC CAAGGTGTCC GTGTAGAAGC AAAATAAAGA ATAGAAGATA GGATGGG6AA 2160 

ACTCTTGACC AGAGGGGTTC ATATCCTTTT TGTGAAAAGA AGTTTATACT CAATGAAAAT 2220 

CAAAGA6CAA ACTAGGAAGC TAGCCGCAGG CTGCTCAAAA CAGTGTTTTG AGGTTGCAGA 2280 

TAGAACTGAC GAAGTCAGCT CAAGACACTG TTTTGAGGTT GCAGATAGAA CTGACGAAGT 2340

CAGTAACCAT ACTACGGTAA GGTGACGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTAT 2400

TAGTTAAAAA CGTGATAAAG GAGAAATAAA GATGGCAGAA ATTTATCTAG CAGGTGGTTG 2460 

TTTTTGGGGC CTAGAGGAAT ATTTTTCACG CATTTCT6GA GTGCTAGAAA CCAGTGTTGG 2520 

CTACGCTAAT GGTCAAGTCG AAACGACCAA TTACCAGTTG CTCAAGGAAA CAGACCATGC 2580 

AGAAACGGTC CAAGTGATTT ACGATGAGAA GGAAGTGTCA CTCAGAGAGA TTTTACTTTA 2640 

TTATTTCCGA GTTATCGATC CTCTATCTAT CAATCAACAA GGGAATGACC GTGGTCGCCA 2700 

ATATCGAACT GGGATTTATT ATCAGGATGA AGCAGATTTG CCAGCTATCT ACACAGTGGT 2760 

GCAGGAGCAG GAACGCATGC TGGGTCGAAA GATTGCAGTA GAAGTGGAGC AATTACGCCA 2820 

CTACATTCTG GCTGAAGACT ACCACCAAGA CTATCTCAGG AA6AATCCTT CAGGTTACTG 2880 

TCATATCGAT GTGACCGATG CTGATAAGCC ATTGATTGAT GCAGCAAACT ATGAAAAGCC 2940 

TAGTCAAGAG GTGTTGAAGG CCAGTCTATC TGAAGAGTCT TATCGTGTCA CACAAGAAGC 3000 

TGCTACAGAG GCTCCATTTA CCAATGCCTA TGACXTAAACC TTTGAAGAGG GGATTTATGT 3060 

AGATATTACG ACAGGTGAGC CACTCTTTTT TGCCAAGGAT AAGTTTGCTT CAGGTTGTGG 3120 

TTGGCCAAGT TTTAGCCGTC CGATTTCCAA AGAGTTGATT CATTATTACA AGGATCTGAG 3180 

CCATGGAATG GAGCGAATTG AAGTTCXSTTC TCGTTCAGGC AGTGCTCACT TGGGTCATGT 3240 

TTTCACAGAT GGACCGCGGG AGTTAGGCGG CCTCCGTTAC TGTATCAATT CTGCTTCTTT 3300 

ACGCTTTGTG GCCAAGGATG AGATGGAAAA AGCAGGATAT GGCTATCTAT TGCCTTACTT 3360 

AAACAAATAA AACAGAGAGT GGGGCTTCCC ACTTTCTTCA TTTCTAGAAT ATGAATAGAA 3420 

GGGATTTATG AAACACCTAT TATCTTACTT CAAACCCTAC ATCAAGGAAT CAATTTTAGC 3480 

CCCCTTGTTC AAGCTGTTAG AAGCTGTTTT TGAGCTCTTG GTTCCCATGG TGATTGCTGG 3540 

GATTGTTGAC CAATCTTTAC CTCAGGGAGA TCAAGGTCAT CTCTGGATGC AGATTGGCCT 3600 

GCTCCTTATC TTTGCAGTAA TTGGCGTTTT AGTGGCCTTG ATAGCTCAAT TTTACTCAGC 3660 

AAAGGCAGCA GTAGGTTCTG CTAAGGAATT GACAAACGAT CTTTATCGTC ATATTCTTTC 3720 

CTTGCCCAAG GACAGCAGAG ACCGTCTGAC AACTTCTAGT TTGGTCACTC GCTTGACTTC 3780 

GGATACCTAC CAGATTCAGA CTGGTATCAA TCAATTCCTG CGTCTCTTTT TACGAGCGCC 3840 

CATTATCGTT TTTGGTGCCA TTTTTATGGC TTATCGAATC TCAGCTGAGT TGACTTTCTG 3900 

GTTCTTAGTC TTGGTTGCXai TTTTGACCAT TGTCATTGTA GGGTTATCTC GATTGGTCAA 3960 

TCCTTTCTAC AGTAGTCTCA GAAAGAAAAC GGACCAACTG GTTCAGGAAA CGCGCCAGCA 4020 

ATTGCAAGG6 ATGCGGGTTA TTCGTGCTTT TGGTCAAGAA AAACGAGAGT TACAGATTTT 4080

TCAAACCCTT AACCAAGTTT ATGCTAGATT ACAAGAAAAG ACAGGTTTCT GGTCTAGnT 4140

ATTAACACCT CTGACCTATC TGATTGTCAA TGGAACTCTT CTCGTTATTA TCTGGCAAGG 4200 

CTATATTTCA ATTCAAGGAG GAGTGCTCAG TCAAGGTGCT CTCATTGCTC TTATCAATTA 4260 

CCTCTTACAG ATTTTGGTGG AATTGGTCAA GCTAGCCATG TTGATCAATT CCCTCAACCA 4320 

GTCCTATATC TCAGTCAAGC GAATCGAGGA AGTCTTTGTT GAGGCTCCAG AGGATATCCA 4380 

TTCAGAGTTA GAACAAAAGC AAGCTACCAG AGATAAGGTT TTACAAGTCC AAGAATTGAC 4440 

CTTTACCTAT CCTGATGCGG CCCAGCCTTC TCTGAGATAC ATTTCCTTTG ATATGACTCA 4500 

AGGACAAATT CTAGGTATCA TCGGGGGAAC TGGTTCTGGT AAATCAAGCT TGGTGCAACT 4560 

CTTACTTGGA CTTTATCCAG TAGACAAGGG GAACATTGAC CTTTATCAAA ATGGACGTAG 4620 

TCCTCTTAAT TTGGAGCAGT GGCGGTCTTG 6ATTGCCTAT GTACCTCAAA AGGTCGAACT 4680 

CTTTAAAGGA ACCATTCGTT CCAACTTGAC TCTAGGTTTC AATCAAGAAG TATCTGACCA 4740 

GGAACTCTGG CAG6CCTTGG AGATTGCGCA AGCTAAGGAT TTTGTCAGTG AAAAGGAAGG 4800 

ACTCTTGGAT GCTCTAGTTG AGGCAGGGGG GCGAAATTTC TCAGGTGGAC AAAAACAAAG 4860 

ATTGTCTATC GCCCGAGCAG TCTTGCGCCA GGCTCCGTTT CTCATCCTAG ATGATGCAAC 4920 

CTCGGCACTG GATACCATTA CAGAGTCCAA GCTCTTGAAA GCTATTAGAG AAAATTTTCC 4980 

AAACACGAGC TTAATTTTGA TCTCTCAACG AACCTCAACT TTACAGATGG CGGACCAGAT 5040 

TCTCCTCTTG GAAAAAGGT6 AGTTGCTAGC TGTTGGCAAG CACGATGACT TGATGAAATC 5100 

CAGCCAAGTC TATTGTGAAA TCAATGCATC CCAACATGGA AAGGAGGACT AGAATGAAAC 5160 

GACAAACTGT AAACCAGACG CTCAAACGTT TAGCCGTAGA TTTAGCAAGC CATCCTTTCC 5220 

TCCTTTTCCT AGCCTTTCrTA GGAACTATTG CCCAAGTTGG CTTATCAATT TACCTACCTA 5280 

TTCTGATTGG GCAGGTCATT GACCAAGTCC TAGTGGCTGG TTCATCACCA GTTTTTTGGC 5340 

AGATTTTTCT CCAGATGCTC TTGGTGGTAA TAGGTU^TAC TCTGGTACAA TGGGCCAATC 5400 

CTCTCCTCTA TAATCGTCTA ATCTTCTCTT ATACCAGAGA TTTACGGGAG CGAATCATCC 5460 

ATAAGCTCCA TCGTTTACCG ATTGCCTTTG TAGATAGGCA AGGTAGTGGA GAGATGGTTA 5520

GTCGTGTAAC CACGGACATC GAACAGTTGG CAGCTGGCTT GACCATGATT TTTAACCAAT 5580 

TTTTCATTGG TGTTTTGATG ATTTTGGTCA GTATTCTAGC CATCCTCCAA ATTCATCTCC 5640 

TCATGACTCT CTTAGTCTTG CTGTTGACGC CACTGTCCAT GGTGATTTCA CGCTTTATTG 5700 

CCAAGAAATC CTATCATCTC TTCCAGAAGC AAACAGAGAC GAGGGGAATT CAGACTCAGT 5760 

TGATTGAAGA ATCGCTTAGT CAGCAGACTA TAATCCAGTC CTTCAATGCT CAAACAGAAT 5820 

TTATCCAAAG ATTGCGTGAG GCTCATGACA ACTACTCAGG CTATTCTCAG TCAGCCATCT 5880

TTTATTCTTC AACGGTCAAT CCTTCGACTC GCTTTGTAAA TGCACTCATT TATGCCCTTT 5940

TAGCTGGAGT AGGAGCTTAT CGTATCATGA TGGGTTCAGC CTTGACCGTC GGTCGTTTAG 6000 

TGACTTTTTT GAACTATGTT CAGCAATACA CCAAGCCCTT TAACGATATT TCTTCAGTGC 6060 

TAGCTGAGTT GCAAAGTGCT CTGGCTTGCG TAGAGCGTAT CTATGGAGTC TTAGATAGCC 6120 

CTGAAGTGGC TGAAACA6GT AAGGAAGTCT TGACGACCAG TGACCAAGTT AAGGGAGCTA 6180 

TTTCCTTTAA ACATGTCTCT TTTGGCTACC ATCCTGAAAA AATTTTGATT AAGGACTTGT 6240 

CTATCGATAT TCCAGCTGGT AGTAAGGTAG CCATCGTTGG TCCGACAGGT GCTGGAAAAT 6300 

CAACTCTTAT CAATCTCCTT ATGCGTTTTT ATCCCATTAG CTCGGGAGAT ATCTTGCTGG 6360 

ATGGGCAATC CATTTATGAT TATACACGAG TATCATTGAG ACAGCAGTTT GGTATGGTGC 6420 

TTCAAGAAAC CTGGCTCACA CAAGGGACCA TTCATGATAA TATTGCCTTT GGCAATCCTG 6480 

AAGCCAGTCG AGAGCAAGTA ATT6CTGCTG CCAAAGCAGC TAATGCAGAC TTTTTCATCC 6540 

AACAGTTGCC ACAGGGATAC GATACCAAGT TGGAAAATGC TGGAGAATCT CTCTCTGTCG 6600 

GCCAAGCTCA 6CTCTTGACC ATAGCCCGAG TCTTTCTGGC TATTCCAAAG ATTCTTATCT 6660 

TAGACGAGGC AACTTCTTCC ATTGATACAC GGACAGAAGT GCTGGTACAG GATGCCTTTG 6720 

CAAAACTCAT GAAGGGCCGC ACAAGTTTCA TCATTGCTCA CCGTTTGTCA ACCATTCAGG 6780 

ATGCGGATTT AATTCTTGTC TTAGTAGATG GTGATATTGT TGAATATGGT AACCATCAAG 6840 

AACTCATGGA TAGAAAGGGT AAGTATTACC AAATGCAAAA AGCTGCGGCT TTTAGTTCTG 6900 

AATAAGCCAT TCTCTTTTGA AAGTTTATGG ACGAAAAAAG TTGCCTTCGA GTGACTTTTT 6960 

TGTTACAATA GCTAGAAAAA TTGTTCACTG TAATACTCAA TGAAAATCAA AGAGCAAACT 7020 

AGGAAGCTAG CCGTAGGTTG CTCAAAGCAC AGCTTTGAGG TTGTAGATAA GACTGACGAA 7080 

GTCAGTTCAA AACACTGTTT TGAGGTTGCA GATAGAACTG ACGAAGTCAG CTCAAAACAC 7140 

TGTTTTGAGG TTGCAGATAG AACTGACGAA GTCAGCTCAA AACAGG 7186

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14273 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

CTGAAAATTC TAAAAAATTT ATAAGTAAGG AATTAATTAG TTATTTTTGT GATAAAGTTT 60

ATGATGAAAT ATTTGTTGAA GAGGTAGTTC CGCACGTTTT TCTGCCATAT GAATCTGACT 120

TACTTCTTAT TTTACCAGCT ACGGCAAATG TGATTGGCAA AATTGCTAAT GGTATTGCTG 180

ATGATTTAGT TACAGCAACT GTTTTAAACT TTAATAAAAA AATAATTTTT TGTCCCAATA 240

TGAACTCTAC TATGTGGGAC AATCACATAG TTCAAAGAAA TGTATCAATT CTAAAGGAGT 300

TGGGACATAT ATTTTTATTT GAGTCTAAAA AAACATATGA GGTAGGATTG CGTAAAGCAA 360

TAGATTCAAC ATGTTCAATG TTACAACCAC AGTCGTTAGT AAAAGAACTT ATCAAATTAG 420

AAAATATTGT CCTTGAAGAG GGACATTAAA AACTACTGAG AATATTAATG AGGGGAAAAA 480

ATGGAAAATT CATCAATC6A TGTAGATATG CTGTTGGAAG AAT7GACACA AGAAGCAATG 540

GTCGTTGTTG CTGTTGATAA G6ACTGTTAA TTTAAACTTA TGGCAATATA TGAAAGGTTA 600

CTGGATGTTT TAAATTATGC AGGCAGTAGC CTTTTATTAT ATACAAATGG ATAAAGTAAG 660

GATAATACAA TGATTAATAA AAAAATACAA CAAGTTGTTT TGGAATCATT ACAGAATTTT 720

TTGAATGGGA ACTTCATTTC GCCTTGTGTA GTCTATGATT TTGGCTTGCT GGAAACTGTA 780

CTTGATGAAT TTAAAAATCA AATTCCTGTA ACATTCAATT ACCAACTTTT TTATGCCGTT 840

AAAGCAAATT CAAATGAGAA GATACTTGAA TTCTTAGTAG ATAAAATTGA TGGAGTTGAT 900

GTGGCGTCAT TATCTGAATT AGATGTGGCT AAAAAATTTT TCCCACCAAC TCAAATTTCT 960

GTTAATGGTC CCGCATTTTC TTATGAAACT TTATATAATC TGATTAAAAA ACAATATAAA 1020

GTTGATATTA ACTTTTTGGA ACATCTTCAA CAATTTTCCC CAAAAGAATC T6TTGGAATA 1080

AGAGTAACGG AGCCAGATGA ACTTAATAAT CGTATGAGTC GATTTGGAAT AAATATTTGC 1140

AGTGATAATT GGACTAGTAA TTTACAAAAT CCTTTAATTA CACGACTGCA TTTTCATTTT 1200

GGAGAAAAAG ATGATAAATT TATTGTTAAG TTAGATAAAA TATTATTTAA GTTACAAGAA 1260

ATTAATAAAC TTAGAGAG6T TAGAGAAATA AATCTTGGA6 GCGGTTTTAT GAAATTATTT 1320

ATGGAAAATC GTTTGAAAGA ATTTTTTCTA TCACTTATGG AAATCTATAA AAAGTAOGAT 1380

ATTGATAGTA CTGTGACTAC AATAATAGAA CCAGGTAGTG CAATTACTTC ATTTTCTGCC 1440

TATATGATTA CTAGCCCAGT TAATGTTAGT GAGGTGAATG AGCAGCAGGT TATCACGTTA 1500

GACACATCAA TATACACCAA TACATTATGG TTTGTTCCGC ATATTATTAC AACGTTAAAT 1560

TCAAGTAGTA AAGAGCGTTA TAGTACTATT CTCTATGGTA ATACCTGTTA TGAACATGAC 1620

AAGTATAAAA TGMAGTTTC GCTTCCAAGG TTAACTCAAA ATAGCAGTAT AGTGTTTTTT 1680

CCTGTAGGAG CTTATATAAA AAGCAATCAT TCAAATTTAC ATCGTAATGA TTTTATGCGG 1740

GAGGTATATT TGTGGACAAA AAACTTGACA TATTAGATAA AGT^AAGGAA TATTTAGGAA 1800

ATAAAACTAC TCAAATTCTG GATAATCAAT ATAAAGAATT TTTGAAACTT AATGATATAA 1860

GGCGAGCGTT TGGTATTTCA GAAAAAGTAT TAAACAATTC TTTTAATTTT ACGAGTAAAG 1920

AATTTAATGA TTTAATTAAT AACX5AAAATT ATTTATTCGA ATATGCATGT AGAATTAGAG 1980

AGGAATGGAG AAAAAAATGC TTTAATCATT CTTATCGTTT TCTATGCTCA CCTATAATTA 2040 

CAGATGATTT TCTTAACACX5 AAGACATTGA GAAGTAGCCA AATTGAATAT AAATATGAGC 2100 

GATATTTATC GAAAAGTTCG ATAGGCGATA GAGCGGTTGA TGGCTTTGTT TCCTTCAATA 2160 

CTTTAACAGC TAATGGTATG TCTGCTATTA AACTATGTCT TGAGATATTA AACTCTATTT 2220 

TCTTCAAGAA GAAGATTGAT TTATTATATT CAACCGGATA TTATGAAACA AGATTTTTAT 2280 

TAAATAATCT TGCTAAATCA GGTATTAGTT GCTATGAGGT AAGTAATTGT GAATTGGATA 2340 

AAGATAAATT TTATAATGTA TTCATGAT6G AACCCAATCG AGCCGATTTA ACATTACAAA 2400 

T^CTGATTT CAAGATAGTA GAATATTTTG TTAAGTATAA AAATAATTCA ATAAAAGTCG 2460 

TTATTTTAGA TATTTCATAT CAAGGTTCTA ATTTTAAATT AGTAGAATTT TTAGAGAAAT 2520 

TTAAATTTGC GAATGTAATT ATTTTTGTGG TACGATCTTT GATAAAATTA GATCAAATGG 2580 

GATTA6AATT 6ACAAATGGG GGAATAATAG AAGTGTTTAT TCCTAATCAT TTGAGAAAGT 2640 

TGAAAAATTT TATTGAAGAG GAATTCAATA AATTTAGAAA TTCTCACGGA GCTAATCTAA 2700 

GCCTCTATGA ATACTGTTTG CTTGATAATT CTTTAACTTT AAAAAATGAT TGGAACTATT 2760 

CTGATTTAGT TATGAAATTT ACGAGTAATT TTTATGCTGA TATAAAAGAC TTGTTCATGG 2820 

AAAATTCTGA TATTGAAATC ATCCATGAAG AGGGAGTACC TTTTGTATTT TTAGATTTAA 2880 

TAGGTGAAGG TAAAAAAGAA TATGAAATGT TTTTTCAATG GTTAAACTTC TTTTACAAAC 2940 

AGCTTGGAAT CACATTGTAT GCTAGAAATA GTTTTGGGTT TCGGAATCTA ACAGTAGAGT 3000 

ATTTTGGAAT TATTGGGACA GAAAGATATA TATTTAAGAT TTGTCCAGGT GTTTATAAAG 3060 

GGTTAAGTTA TTATTTGATG AAATTTTTAT TAAAATCTTT TTCAAATGAA TATTTAAAAA 3120 

CTACTGATGA GGTTAATAGA TGAAAAATTT GATAAAGTTG CTAATAATTA GATTGATTGT 3180 

TAACTTAGCA GACAGTGTAT TTTATATAGT AGCATTGTGG CACGTTAGCA ATAATTATTC 3240 

TTCGAGCATG TTCTTAGGAA TATTTATTGC AGTAAATTAT CTACCGGATT TGTTACTAAT 3300 

CTTTTTTGGA CCAGTTATTG ACAGAGTAAA TCCGCAAAAA ATTCTTATAA TATCAATTTT 3360

GGTTCAATTA GCAGTGGCTG TAATATTTTT ATTATTATTA AACCAAATAT CATTTTGGGT 3420 

GATAATGAGT CTAGTGTTTA TTTCAGTAAT GGCTAGCTCC ATAAGTTACG TGATAGAAGA 3480 

TGTGTTGATT CCTCAAGTGG TAGAATATGA TAAGATTGTA TTTGCAAATT CTCTTTTTAG 3540 

TATTTC6TAT AAAGTATTAG ATTCTATTTT TAATTCATTC GCATCATTTT TACAGGTGGC 3600

AGTAGGATTT ATTTTATTGG TTAAGATAGA TATAGGCATA TTTTTACTTG CrCTATTTAT 3660

ATTGTTGTTG TTAAAATTTA GAACTAGCAA TGCGAATATA GAAAACTTCT CTTTCAAATA 3720 

TTACAAGAGA GAAGTGTTGC AAGGTACAAA GTTTATTTTA AATAATAAAT TATTATTTAA 3780 

AACCAGTATT TCTTTAACGC TTATAAACTT TTTTTATTCA TTTCAGACAG TAGTTGTACC 3840 

GATTTTTTCT ATTCGATATT TTGATGGTCC GATTTTTTAT GGTATTTTTT TAACTATTGC 3900 

TGGTTTGGGT GGTATATTGG GAAATATGCT AGCGCCAATC GTAATAAAAT ATTTAAAATC 3960 

6AATCAAATT GTTGGTGTAT TTCTTTTTTT GAACGGCTCA AGTTGGTTAG TAGCAATTGT 4020 

TATAAAAGAC TATACTTTAT CACTTATTTT ATTTTTCGTP TGTTTTATGT CTAAAG6AGT 4080 

CTTCAATATT ATTTTTAATT CGTTGTACCA ACAAATACCT CCACATCAAC TTCTTGGTAG 4140 

GGTAAATACT ACCATTGATT CTATTATTTC TTTTGGAATG CCAATTGGTA GTTTAGTTGC 4200 

AGGAACGCTT ATTGATTTGA ATATTGAATT AGTGTTAATT GCTATTAGCA TACCTTATTT 4260 

TTTGTTTTCT TATATTTTTT ATACGGATAA TGGATTGAAA GAATTTAGTA TATATTAGAA 4320 

ATGTTTATGT TCATTCAAAA GCATAATGAC TATAACTGAA AAAGAAAAGT GATATCTTTA 4380 

AGGTTGTTCT TCTTGGTGGT GAGATTCGTG AGACAACCCA AGCTTTTGTC GGAAAGATTA 4440 

CCAATGCTTT GATGGATAGG ATGTACTTTA GCAAGATGTT TTTAGTGGTA ACGGTATCGT 4500 

GGATGGACGT GTAATAACCT CTTCTTTCGA GGAGTATTTT ACTAAAAAAC TAGCCTTGGA 4560 

GCGTTCCCCA GAAACGGACT TACTCATTGA CTCTTCAAAG ATTTGGGGAG AAGATTTTGC 4620 

TTCATCTGTT CCTTGAAAAA AGTCACAGCA GTCATCACAG ACGATAGTAC TGAACAAAAC 4680 

TATGAAGAGT TAGAAATTTA TAC6CAGGTG ATTGTATAAA GGATCTGGAA ATAGATAAGA 4740 

AGTTGATTAG TATTGACCTA GGTGGTACAA ATATTAAGAT TACTGTTCTT TCAAATGACG 4800 

GTGAGATTGA AACTTTGTGG AGTATTACAA CAGATACAAG TGAGAAAGGT TCTCAAATTA 4860 

TATCGGACAT CATCAGTTCT ATTAAAAATA AATTGACCGA ACGGAATATT CCTGATAGCG 4920

ACCTTCTTGG AATCGGTATG GGAAGTTGCT CATCATACTT TCCTTGTAAA TCATAGGGGC 4980 

TATAAACTCT CCGTCTACTT GTCCTGCAAC AATTGAAGTC TGCTCAAAAC GCCGTCCGCT 5040

AATCTTTTCA TAGACTTTCT CCCTTTTAGG A6CCTAGCTT TCTAGTTTGT TCTTTGATTT 5100 

TTATTGAGTA TACCACTATT TTACTCCCTC TGGCAAGGGA CTTTGTCTAT GTGGAGGGAT 5160 

TGGGCTCCTA TGTGGTGGAG CTTTTCTGTT CTTTCTGAAA TATGGTATAA TAGCACTAAT 5220 

CAATTTCTAG GAAAATAGAT ACAGAAAGGG GCTGAAAGAT GTCTCATATT ATTGAATTGC 5280 

CAGAGATGCT GGCAAACCAA ATCGCX36CTG 6AGAGGTCAT TGAACGTCCT 6CCAGTGTGG 5340 

TCAAAGAGTT GGTAGAAAAT GCCATTGACG CXK»3CTCTA6 TCA6ATTATC ATTGAGATTG 5400 



AGGAAGCTGG TCTCAAGAAG GTTCAAATCA CGGATAACGG TCATGGAATT GCCCACXyiTG 5460

AGGTGGAGTT GGCCCTGCGT CGCCATGCGA CCAGTAAGAT AAAAAATCAA GCAGATCTCT 5520

TTCGGATTCG GACGCTTGGT TTTCGTGGTG AAGCCTTGCC TTCTATTGCG TCTGTTAGTG 5580

TCTTGACTCT GTTAACGGCG GTGGATGGTG CTAGTCATGG AACCAAGTTA GTCGCGCGTG 5640

GGGGTGAAGT TGAGGAAGTC ATCCCAGCGA CTAGTCCTGT GGGAACCAAG GTTTGTGTGG 5700

AGGATCTCTT TTTCAACACG CCTGCCCGTC TCAAGTATAT GAAGAGCCAG CAAGCGGAGT 5760

TGTCTCATAT CATTGATATT GTCAACCGTC TGGGCTTGGC CCATCCTGAG ATTTCTTTTA 5820

GCTT6ATTAG TGATGGCAAG GAAATGACGC GGACAGCAGG GACTGGTCAA TTGCGCCAAG 5880

CAATCGCAGG GATTTACGGT TTGGTCAGTG CCAAGAAGAT GATTGAAATT GAGAACTCTG 5940

ACCTAGATTT CGAAATTTCA GGTTTTGTGT CCTTGCCTGA GTTGACTCGG GCTAACCGCA 6000

ATTATATCAG CCTCTTCATC AATGGCCGTT ATATTAAGAA CTTCCTGCTC AATCGTGCTA 6060

TTTTGGATGG TTTTGGAAGC AAGCTTATGG TTGGACGTTT TCCACTGGCT GTCATTCACA 6120

TCCATATCGA CCCTTATCTA 6CGGATGTCA ATGTGCATCC AACTAAGCAA GAGGTGCGGA 6180

TTTCCAAGGA AAAAGAACTG ATGACTCTGG TTTCAGAAGC TATTGCAAAT AGTCTCAAGG 6240

AACAAACCTT GATTCCAGAT GCCTTGGAAA ATCTTGCCAA ATCGACCGTG CGCAATCGTG 6300

AGAAGGTGGA GCAAACTATT CTCCCACTCA AAGAAAATAC GCTCTACTAT GAGAAAACTG 6360

AGCCGTCAAG ACCTAGTCAA ACTGAAGTAG CTGATTATCA GGTAGAATTG ACTGATGAAG 6420

GGCAGGATTT GACCCTGTTT GCCAAGGAAA CCTTGGACCG ATTGACCAAG CCAGCAAAAC 6480

TGCATTTTGC AGAGAGAAA6 CCTGCTAACT ACGACCAGCT AGACCATCCA GAGTTAGATC 6540

TTGCTAGCAT CGATAAGGCT TATGACAAAC TGGAGCGAGA AGAAGCATCC AGCTTCCCAG 6600

TTTCGGACAA ATGCACGGGA CTTATCTCTT TGCCCAAGGG CGAGATGGAC 6660

TTTACATCAT AGATCAGCAC GCTGCTCAGG AACGGGTCAA GTACGAGGAG TACCGTGAAA 6720

GCATTGGCAA TGTTGACCAA AGCCAGCAGC AACTCCTAGT GCCCTATATC TTTGAATTTC 6780

CTGCGGATGA TGCCCTGCGT CTCAAGGAAA GAATGCCTCT CTTAGAGGAA GTGGGCGTCT 6840

TTCTAGCAGA GTACGGAGAA AATCAATTTA TTCTACGTGA ACATCCTATT TGGATGGCAG 6900

AAGAAGAGAT TGAATCAGGC ATCTATGAGA TGTGCGACAT GCTCCTTTTG ACCAAGGAAG 6960

TTTCTATCAA GAAATACCGA GCAGAGCTG6 CTATCATGAT GTCTTGCAAG CGATCTATCA 7020

AGGCCAATCA TCGTATTGAT GATCATTCAG CTAGACAACT CCTCTATCAG CTTTCTCAAT 7080

GTGACAATCC CTATAACTGT CCTCACGGAC GGTGCATTTT ACCAAGTCGG 7140

ATATGGAAAA GATGTTCCGA CGTATTCAGG AAAATCACAC CAGTCTCCGT GAGTTGGGGA 7200

AATATTAAAA GTATAAAAAA GTCTGGGAAA AATTTTCAAA ATCAAAAAAA CGCATAAAAT 7260 

CAGGTGTTCA AAAACCTTGA TTTTATGCGT TTTATCATGG AAATAGTTAC TTCATTTTTT 7320 

CCTAATTCTT TTCGAAACTC TTTTTAAACG ACGTCAGTTT TATCAGTAAT CTCAAAACAG 7380 

TGTTTTGAGC TAATTTTGCC AGTTTTGTCT GTAACATCGA AGTTGTGTTT TACCACTCTG 7440 

CGACTGGTTT CCTAGTTTGC TCTATGATTT TCACAGAGCA TTAAATTGCG ATTTTGCCAA 7500 

GTTTCTTTAT TCGTCTAAAA GTAGAGTCTG TTCTATGCGT CTAATGTACG AATCAGGTTG 7560 

ACCATTTCAA TAGCTCCTTG TGCACACTCA GAACCCTTAT TTCCTGCTTT AGTACCAGCT 7620 

CGTTCTATGG CTTGTTCAAT TGTATCTGTC GTTAGCACAC CAAACATAAC AGGAATTTCG 7680 

CTATTTAAAC TGATTTGGGC GATTCCCTTA GATACCTCGC TACATACATA ATCATAATGA 7740 

CTTGTATTCC CTCTAATGAC AGCTCCCAAG CA6ATAATTG CATCATATTT TTTACTTTTT 7800 

GCCATTTTTG ATGCAATCAG TGGTATTTCA AAAGCTCCTG GAACCCAGGC TACCTCTATA 7860 

TCTTTCTCGT TTACATTCTC TCTTTTGAGA TTATCTAGTG CTCCAGATAA TAATTTTGAA 7920 

GTTATAAATT CATTAAATCT CGCTACAACA ATACCTATTT TAATATTGTT TGCTACTAAA 7980 

TTACCTTCAT AAGTGTTCAT TTATTTTTCC TCCATATTTA AAATGTGACC CATTCGATTT 8040 

TTCTTTGTTT CTAAATAAAA ACTATCGTAA GGATTGGCTT CTATTTCGAT TGATATTCTA 8100 

CTGGAAATGG TAATTCCATA TTTTTCTAAC TGTTCAACCT TGTCAGGATT ATTTGTCAGT 8160 

AAATGAAGTG ACTGAAGTCC CAGATCTTTA AGCATTTTTG CTCCAATATG ATATTCTCTT 8220 

AAATCACCTT CAAAGCCTAA TGCAAGATTG GCATCAAGCG TATCCATGCC TTGATCTTGT 8280 

AAATGATAGG CTTTTAATTT ATTGATAAGT CCAATTCCTC GTCCCTCCT6 TCGCAAGTAA 8340 

AGTAAGACAC CCGAACCATT CTCAACAATC ATTTTCATAG CCTTATCGAA TTGCTGTCCA 8400 

CAATCGCAAC GTAAAGAGCC TAAAACATCT CCTGTTAAAC ATTCGGAGT6 GACCCGACAT 8460 

AATACATTGG CTTCATCCTC TATATTTCCC ATAATAAGAG CAAGATGATG TTCCCCATTT 8520 

AGTTTATCTA TATAGCTAAT TGCTTTGAAA TTACCGTATC TAGTAGGCAT ATTGACAGTT 8580 

GAAACTCGTT CTACCAGCTG ATCATATACT TTTCTATATT CTTGTAATTC TTTGATGGTA 8640 

ATTAGTGGAA TGTTQTGTTT TTTCGAGAAC TGAATTAAAT CATCTGTTCT CATCATTTTG 8700 

CCATCATGAT TCATTATTTC ACAACATAGG CCACACTCTT TTAGTCCAGC TAATTTTAAT 8760 

AAATCAACAG TTGCTTCTGT GTGTCCATTT CTTTCTAGGA CACCACCTTT TTTTGCAATT 8820 

AAAGGAAACA TGTGTCCTGG CCTGCGAAAA TCAGAGGGTG TTATATCTTC AGCTACACAC 8880 

ATACGTGCGG TCAGTCCTCT TTCCTCGGCA GAAATACCTG TGGTCGTTTC TTTATAATCA 8940

ATTGAAACTG TAAAAGCAGT CTTATGATTA TCTGTATTGT TTTCAACCAT AGGTGAAAGC 9000

ATTAATTGAT TAGCTAAACT TTCGCTCATA GGCATACAAA TTAATCCTTT GGCATAAGTA 9060

GCCATAAAAT TAACATTTTC TGTTGTAGCT GCTTGTGCAG AACAAATTAA GTCTCCTTCA 9120

TTTTCTCTAT CCTTGTCGTC TATAACAAGA ACAAGTCGTC CCTTCTGCAA TGCTTCTAAT 9180

GCTTCTTGTA TTTTTCGATA TTCCATTGAC TGATTATCCT TTCTGCTAAA ATCCATTTTG 9240

ATATAATAGT TCCTTAGATA TTTCTGATTT TGGAGAGTTA TCCATCAGTT TTTGCACATA 9300

TTTACCTAAG ATATCATTTT CAAGATTTAC TGTACTCCCG ACTTGTTTAC TCTTAAGAAT 9360

GGTTTGTTCC AAGGTATGAG GGATAACAGA TACTGAAAAG TTTACTTTGG AGACTTTAGC 9420

GACAGTCAGA CTAATGCCGT CAATTGTAAT AGATCCTTTT TCAACTATTA AATCTAAAAT 9480

TTCTTTTTGT GTGTTGATTT GATACCATAC AGCATTATCA TCTTTTTTTA TTGACGAGAT 9540

TTTTCCTGTA CCATCAATGT GTCCTGTAAC GACGTGACCC CCAAGTCGAC CGTTGACAGA 9600

TAAGGCTCTT TCTAGATTCA CCTCACTTCC ATGTTTTAAT AGAGTAAGAG CTGTTCGACT 9660

CCATGTTTCA TTCATTACAT CAACTGTAAA GGATTGATGA TTGAAATGAG TAACTGTAAG 9720

ACAGATACCA TTTACTGCTA TACTATCGCC TAAATGGATA TCCGTTAATA TTTTTGAGGC 9780

TTTAATTGAT AGTTTACAAT TACGAGAGTC TTTCTGTATT CTTTCAACTT TTCCGATTTC 9840

TTCAATTATT CCTGTGAACA TGGATAAATC ACTTCACTTT CTATGAGATA GTCATTTCCT 9900

ATTTGAGAAA ATGCATAAGG TTTCAATCTA ATAGCGTCAT TTGGCAAAGA AATACCTTCA 9960

CCTCCGACAG GAAACTTGGC ACTACCTCCA AAAACTTTTG GTGCAATATA TATTTTCAGC 10020

TCATCAACAA TTTGTTGTTC CAAAGCACTC CAATTCATTA GACTGCCCCC TTCTAGAACT 10080

AGGCTATCAA TCTGCATGTT TCCTAGATGT TGCATTAAAC TCGATAAGTC TATATGATTG 10140

CCTTTTTTCT TTATGGAAAG TATTTCACAG CCATGATTTT GATATAGCTT CATTTTATTT 10200

TTGTCTTCAG AGGAAGTGGC AATGTAAGTT TTAATATCAT TTGCTGTTTT TACGATTTTA 10260

GAGGTAAGAG GAGTTCGTAA ATGTGTATCG CATATGATAC GGATAGGATT TTTCCCTTCC 10320

TCCAATCTAC ATGTCAGCAA AGGATCGTCT TGAATAACAG TATTGACTCC CACCATAATT 10380

GCACTAACAT GGTGTCGTAA CTGATGCACA TGCTTTCTTG CTTCTTCTTC AGTAATCCAT 10440

TTGGATTGAT TTGTTTTAGT GGCTATTTTT CCATCCATTG ACATTGCATA TTTCATAAAA 10500

ACATAGGGTA CATGCTGGGT AATATACTTT CTAAAACTTT TTATTAAGTT AAGACACTCA 10560

TTTTCTAAAA TTCCAACAGT AACTTGAAGA TTATTTTCCT CAAGTATCTT TACTCCTTTT 10620

CCAGATACAA TAGGATTACA GTCTAGGCTT CCAATGACTA CTCTTGTAAT ACCACTATCG 10680

ATTATAGCAT CTATACAGGG AGGTGTTTTC CCGAAGTGAC AACAGGGTTC AAGTGTTACA 10740

TAAAGCGTCG CTCCGACAGG GGATTCTCTA CAGTTTTTAA GAGCATTTCT CTCAGCATGT 10800 

GGGCCACCAA AAAACTCATG ATAACCTTGT CCGATAATGT GATTATCTTT TACAATAACT 10860 

GCGCCGACCA TAGGATTGGG ATTGACGTAA CCAGCCCCTT TTTGTGCCAG TTTTATTGCT 10920 

AATTTCATAT ATTTTGAATC GCTCATCTCG CTACCTCCAA AAAAATATAC CTTGAATAGG 10980 

GGACTACTCA AGGCATACAA AAGAAAACTT ATGCGATTAA CAAAAATGCT CTGAAATGAC 11040 

AAGTAATCAT TTCAGAGCAC GCAAAAAGCA CAAATATACT TTTATCTTCT TTCATCCAGA 11100 

CTATACTGTC GGCTTTGGAA TTTCACCAAA TCATGCCTTT CGGCTCGTGG GCTATACCAC 11160

CGGTAGGGAA TTTCACCXTTG CCCTGAAGAT AGTTATTCAA TTACAGATGA TTATAGTACT 11220 

TAATTTTGAA TATGTCAACA GATAAATACC GATTGTTTTT GATATACTGT ATTTGTGATA 11280 

ATCGATTCTC GCTCCTOGGA TAAAGAAAAT ATGATATACT AGATAAACGA AATAAGAGAG 11340 

AAGGAATACT ATGTACGCAT ATTTAAAAGG AATCATTACC AAAATTACTG CCAAATACAT 11400 

TGTTCTTGAA ACCAATGGTA TTGGTTATAT CCTGCATGTG GCCAATCCTT ATGCCTATTC 11460 

AGGTCAGGTT AATCAGGAGG CTCAGATTTA TGTGCATCAG GTTGTGCGTG AGGACGCCCA 11520 

TTTGCTTTAT GGATTTCGCT CAGAGGATGA GAAAAAGCTC TTTCTTAGTC TGATTTCGGT 11580 

CTCTGGGATT GGTCCTGTAT CAGCTCTTGC TATTATCGCT GCTGATGACA ATGCTGGCTT 11640

GGTTCAAGCC ATTGAAACCA AGAACATCAC CTACTTGACC AAGTTCCCTA AAATTGGCAA 11700 

GAAAACAGCC CAGCAGATGG TGCTGGACOT GGAAGGCAAG GTAGTAGTTG CAGGAGATGA 11760

CCTTCCTGCC AAGGTCGCAG TGCAAGCAAG TGCTGAAAAC CAAGAATTGG AAGAAGCTAT 11820 

GGAAGCCATG TTGGCTCTGG GCTACAAGGC AACAGAGCTC AAGAAAATCA AGAAATTCTT 11880

TGAAGGAACG ACAGATACAG CTGAGAACTA TATCAAGTCG GCCCTTAAAA TGTTGGTCAA 11940 

ATAGGAGCAG AGAATGACAA AACGTTGTTC GTGGGTCAAG ATGACCAACC CGCTCTACAT 12000 

CGCCTATCAT GATGAGGAGT GGGGCCAGCC CCTCCATGAT GACCAAGTAT TGTTTGAGIT 12060 

GTTGTGTATG GAAACCTATC AGGCAGGCCT GTCTTGGGAA ACGGTACTCA ACAAACGCCA 12120 

AGCTTTCCGA GAAGTCTTTC ATAGCTATCA AATTCACTCA GTCGCAGAGA TGACTGACAC 12180 

TGAATTGGAA GCCATGCTGG AGAATCCAGC TATCATTCGA AATAGAGCCA AGCTTTTTGC 12240 

TACACGCGCT AACGCCCAAG CCTTTCTACA GTTACAGGCA GAGTACGGCT CTTTTGATGC 12300 

CTATCTTTGG TCTTTTGTTG AGGGGAAAAC TGTCGTTAAC GATGTTCCTG ATTATCGCCA 12360

AGCGCCAGCT AAAACACCXn^ TATCTGAGAA ATTAGCCAAA GATCTCAAAA AACX3AGGCTT 12420 

CAAGTTCACA GGCCCAGTCG CCGTATTGrrC TTTTCTACAG GCTGCAGGGC TAGTTGATGA 12480

CCACGAGAAT GATTGTGAGT GGAAAGGTCT TAAATGATGT CT7WVCAAAAA TAAGGAAATT 12540

CTGATTTTTG CGATTCTCTA TACAGTCCTC TTTATGTTTG ATGGCGTTAA ATTGCTGGCT 12600 

TCTTTAATGC CATCTGCCAT TGCAAATTAT CTTGTTTATG TAGTTTTAGC TCTATATGGC 12660

TCCTTCTTGT TCAAGGATAG ATTGATCCAA CAATGGAAGG AGATTAGAAA GACTAAAAGA 12720

AAATTCTTCT TTGGAGTCTT AACAGGATGG CTCTTTCTCA TTCTGATGAC TGTTGTCTTT 12780 

GAATTTGTAT CAGAGATGTT GAAGCAGTTT GTGGGACTAG ATGGACAAGG TCTAAATCAG 12840 

TCTAATATTC AAAGTACCTT TCAAGAACAA CCACTACTGA TAGCTGTTTT TGCTTGTGTC 12900 

ATTGGACCTC TGGTAGAAGA ATTATTTTTC CGTCAGGTCT TATTGCATTA CTTGCAGGAA 12960 

CGGTTGTCAG GTTTACTAAG CATTATTCTG GTAGGACTTG TTTTTGCTCT GACTCATATG 13020 

CACAGTTTGG CTCTATCAGA GTGGATTGGT GCAGTTGGTT ACTTAGGTGG AGGCCTTGCC 13080 

TTTTCTATTA TTTATGTGAA AGAAAAAGAG AATATCTACT ATCCCCTACT TGTTCACATG 13140

TTAAGCAACA GCCTCTCCTT AATCATTTTA GCTATCAGTA TAGTAAAATG AAATGAGAAC 13200 

AGGACAAATC GATTTCTAAC AATGTTTTAG AAGTAGAGGT GTACTATTCT AGTTTCAATA 13260 

TACTGTAATA TGTGATGAAA ATGCCAGTAA TGATACCGAG AAAAAAGCTG AGAAACTTTT 13320 

CCCAGCTTTA TTTGTTATAG TCAAAGAGAA TGACTTGTTC CTGTGCATCT ACATGAGCAT 13380 

GGACCCCAAA GGGTACAATT GCTCTTGGAG TTGCGTGGCC GACATTCAGA TTATAGACAA 13440 

TCGGGATATT GCTGTCAATG ATATCCAATA GTGCCTCTTT ATAGTCGTCA TGGAAAGTTT 13500 

CATCCATAGG TTTTCCGACC AAGAGTCCAT TGATGACCGC GAATATGCCA GTGTCCTTTA 13560 

AAGTTAGCAA CATCTTTTTG AAGTCTTCTG GCTTAGGCTT TPCTTCGCTT GTTTCGAGCA 13620 

AGAGGATTTT CCCTTCCCAG TCTGACAAGT CAGGGAAAAG TTTGTATTTT TGGCAGAGTT 13680 

CCGTGCTATC TGCX3TATCGA GAGTTGTCAA AGATATCGTA GAGGGATTCG AGGCAACCAC 13740

CGAGGATTTT CCCCTCGAAC TGGGCACTTC CTTGCAACAA GTCAAAACCT GTATTTGTAT 13800 

GACTGACA06 AGGTGTTCCX: AGGGCCGTGG GACTAAAATC AGTTCGTTCC TCATACCAAA 13860 

CGTCACTAGG GCGGATTTCT GAAATTCTTC CCGTCTCAAT CAATTCTTTA AAGTAGTGAA 13920 

GGCTATAGGC TAGCATTTCT TTGTCTAATT CACAAATGTC TGCTAAAAAG GATTGACCAT 13980 

AAAAAGTCTT GATTCCTAAT TTATGCAACA TGAGGTGGTT CATGGTTGTA TCCGAGAAGC 14040 

CAAGAAAAAT TTTTTGCTTG ATAACCTTTT GGAGTTGGTC ATTTTCAAAA AGATAAGGTA 14100

GCAAGCGATA GGTATCGTCT CCACCGATGG CACATAGGAT CATGTCGATG CTATCATCAG 14160

AAAAGGCATG AATCAAATCC TCTGCACGAG CTTCAGGATG GTCCTTGATA AAGTCTAATC 14220

CTTTTAACGA ATGGGGCAAA AAGATGGGAT TGGTCCCAGA TCCTTGAGAC GTT 14273

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9828 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



GTGAAGTGCG GCAAAAGGTG CAAGTGATGA GCTCAGGTTC TTTAGCTCTT GACATTGCCC 60

TTGGCTCAGG TGGTTATCCT AAGGGACGTA TCATCGAAAT CTATGGCCCA GAGTCATCTG 120

GTAAGACAAC GGTTGCCCTT CATGCAGTTG CACAAGCGCA AAAAGAAGGT GGGATTGCTG 180

CCTTTATCGA TGCGGAACAT GCCCTTGATC CAGCTTATGC TGCGGCCCTT GGTGTCAATA 240

TTGACGAATT GCTCTTGTCT CAACCAGACT CAGGAGAGCA AGGTCTTGAG ATTGCGGGAA 300

AATTGATTGA CTCAGGTGCA GTTGATCTTG TCGTAGTCGA CTCAGTTGCT GCCCTTGTTC 360

CTCGTGCGGA AATTGATGGA GATATCGGAG ATAGCCATGT TGGTTTGCAG GCTCGTATGA 420

TGAGCCAGGC CATGCGTAAA CTTGGCGCCT CTATCAATTVA AACCAAAACA ATTGCCATTT 480

TTATCAACCA ATTGCGTGAA AAAGTTGGAG TGATGTTTGG AAATCCAGAA ACAACACCGG 540

GCGGACGTGC TTTGAAATTC TATGCTTCAG TCCGCTTGGA TGTTCGTGGT AATACACAAA 600

TTAAGGGAAC TGGTGACCAA AAAGAAACCA ATGTCGGTAA AGAAACTAAG ATTAAGGTTG 660

TAAAAAATAA GGTAGCTCCA CCOTTTAAGG AAGCCGTAGT TGAAATTATG TACGGAGAAG 720

GAATTTCTAA GACTGGTGAG CTTTTGAAGA TTGCAAGCGA TTTGGATATT ATCAAAAAAG 780

CAGGGGCTTG GTATTCTTAC AAAGATGAAA AAATTGGGCA AGGTTCTGAG TU^TGCTAAGA 840

AATACTTGGC AGAGCACCCA GAAATCTTTG ATGAAATTGA TAAGCAAGTC CGTTCTAAAT 900

TTGGCTTGAT TGATGGAGAA GAAGTTTCAG AACAAGATAC TGAAAACAAA AAAGATGAGC 960

CAAAGAAAGA AGAAGCAGTG AATGAAGAAG TTCCGCTTGA CTTAGGCGAT GAACTTGAAA 1020

TCGAAATTGA AGAATAAGCT GTTAAAGCAG TGGAGAAATC CGCTACTTTT TCGATTTTTG 1080

ATTCAAGTTT TTAGATTATA TATAGTAGCT TGAAATAAGA TATGAACAAC TCTATTAGGA 1140

AAGTCAAATT AATTTCTAGA AATGTTTTAG CAGCTACAGC GTACTATTCC AAACTCAACC 1200

AACTATAATA GATCGAAACT AGAATAGTAC ATATCTACTT CTAAAACATT GTTAAAAATC 1260

GATTTGACTT TCCTTATTTC ATTCCGCTAT ATATAGTTTG CTGTTTCTTG TCGCTCCTCT 1320

GGAAAGCTGA TATAATAGCT TTATGAATAA AAAACGAACA GTGGACCTGA TACATGGTCC 1380



